Building ECTO #238

Open
davidshumway opened this issue Sep 5, 2022 · 13 comments

Comments

@davidshumway

davidshumway commented Sep 5, 2022

How are sub-ontologies within ECTO merged? Is there documentation, wiki, or an article describing this process?

Per documentation, ECTO integrates the following ontologies:

Ontologies used in composition (largely orthogonal):

  • Exposure Ontology (ExO) - used as the upper ontology, for base classes such as 'exposure' and for different routes such as 'ingestion'
  • Chemical Entities of Biological Interest (CHEBI) - used for both entities and roles
  • Environment Ontology (ENVO) - environmental materials, processes
  • NanoParticle Ontology (NPO) - radiation
  • Relations Ontology (RO) - relations
  • Phenotypic Quality Ontology (PATO) - qualities
  • UBERON Anatomy Ontology - tissue types (not used yet)
  • NCI Thesaurus (NCIT) - activities such as smoking
  • Sustainable Development Goals Interface Ontology (SDGIO) - social entities
  • Population and Community Ontology (PCO) - population attributes (e.g. overcrowding)

Was there a manual process of selecting classes of interest from each sub-ontology? Or were the sub-ontologies included in whole? Or were ECTO classes simply built from concepts found in the sub-ontologies (e.g. using DOSDPs), and by so doing relevant concepts in the sub-ontologies were thus integrated into ECTO?

@cmungall
Member

cmungall commented Sep 6, 2022 via email

@davidshumway
Author

davidshumway commented Sep 6, 2022

Thanks, @cmungall!

> NPO is no longer needed; radiation is in ENVO.

Yes... (echo "!!!!!NPO currently skipped!")

While /src/ontology/ecto.Makefile is certainly readable, anyone approaching the code for the first time (here) might be hard-pressed to understand the merging and DP templating processes. I assume most of the merging and templating is run either there or in src/ontology/Makefile.

I see that src/ontology/Makefile uses the ODK. In terms of merging, perhaps reviewing ODK is a good place to start? (e.g. here)

@cmungall
Member

cmungall commented Sep 6, 2022

Can I just get a bit of context - why do you need to understand the Makefile? I think for this project it is generated by the ODK so yes understanding ODK would help, and I encourage all forms of learning about the ODK! But really you only need to know this if you are working on the ECTO release pipeline...

@davidshumway
Author

The context would be adding new terms to ECTO as well as using ECTO as part of another domain-specific ontology.

The domain is waterborne illnesses in the context of recreation or occupation. For example, swimming at the beach might lead to an exposure to e-coli. The planned source ontologies are related to illnesses, recreational and occupational activities, the environment (ENVO), environmental exposure (ECTO), sources of pollution, chemicals/organisms of interest in the domain.

@davidshumway
Author

davidshumway commented Sep 6, 2022

So just to clarify: adding new terms is, I think, already documented in ECTO. Understanding the merging process is more for my own purposes, simply to understand ECTO, because it seems well built.

@cmungall
Member

cmungall commented Sep 6, 2022

Got it, that helps, thanks!

I'm hoping that we will soon have docs derived from the DOSDPs, analogous to this:

https://mondo.readthedocs.io/en/latest/editors-guide/patterns/

We would then have discussions about ontology dependencies at a per-pattern level, and there would (hopefully) be clear, transparent guidelines.

If you use DOSDPs in your ontology, it may make it easier to reuse patterns as well.

@davidshumway
Author

davidshumway commented Sep 10, 2022

This is more of a discussion topic than an issue.

Is there discussion in ECTO regarding use of exposure outcomes (e.g. https://ontobee.org/ontology/ExO?iri=http://purl.obolibrary.org/obo/ExO_0000003)? For our use case outcomes to exposures are also of interest. Something like this perhaps:

exo:exposure_outcome
  increased / risk / (hpo:diarrhea, hpo:hemolytic-uremic syndrome, hpo:vomiting,…)

exo:is_associated_with

exo:exposure_event # for example, the following…
  
  exposure to / (ecoli, enterococcus,…) / via / exo:route (ingestion, skin contact, eye contact,…) / 
  during / (swimming, wading, beachgoing,…)
  
  ecto:3000009 (exposure to ecoli) / during / (swimming, wading, beachgoing,…)
  
  # ecto:3000009
  'exposure event' and ('has exposure route' some ingestion) and
  ('has exposure stimulus' some 'Escherichia coli')

  'exposure event' and
  ('has exposure route' some ingestion and ('has exposure route' some [prefix]:swimming)) and
  ('has exposure stimulus' some 'Escherichia coli')

From a high level, patterns in ECTO relate two or more source ontology entities, as defined in a template, by creating new entities and associated axioms via the templates. The associated source ontology entities are then included via a ROBOT extract process. Does this capture the high-level idea?
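To make the idea as stated above concrete, here is a minimal Python sketch of one piece of it: gathering the source-ontology terms that pattern rows refer to, which could then serve as seed terms for an extract step. This is only an illustration of the idea, not how the ODK or ROBOT implement it. The function name `collect_seed_terms` and the `PATTERN_TSV` constant are hypothetical; the row itself is in the shape of the ECTO pattern TSVs.

```python
import csv
import io

# Hypothetical pattern data in the shape of an ECTO DOSDP TSV.
PATTERN_TSV = (
    "defined_class\tdefined_class_name\tstressor\tmedium\troute\n"
    "ECTO:0080000\texposure to arsenic in water via ingestion\t"
    "CHEBI:27563\tENVO:00002006\tExO:0000056\n"
)


def collect_seed_terms(tsv_text, own_prefix="ECTO"):
    """Return the sorted external CURIEs used as fillers in the pattern rows."""
    seeds = set()
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        for column, value in row.items():
            # Skip the columns that describe the newly defined class itself.
            if column in ("defined_class", "defined_class_name"):
                continue
            # Keep only CURIE-like values that belong to other ontologies.
            if value and ":" in value and not value.startswith(own_prefix + ":"):
                seeds.add(value)
    return sorted(seeds)


if __name__ == "__main__":
    for term in collect_seed_terms(PATTERN_TSV):
        print(term)
```

The assumption here is that every column other than `defined_class` and `defined_class_name` holds a filler term; a real pattern file may have extra columns that need different handling.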

Are there any classes or relations specific to ECTO rather than taken from source ontologies and if so what is the procedure for adding these?

Are there any organizational changes made to the ontology after merging, for example, combining classes or reorganizing hierarchies?

Is there a ROBOT diff included showing pre- and post-reasoning?

@davidshumway
Author

davidshumway commented Sep 10, 2022

In terms of disease outcomes perhaps these could be realized in Mondo? For example,

Chan, L. E., Vasilevsky, N. A., Thessen, A., Matentzoglu, N., Duncan, W. D., Mungall, C. J., & Haendel, M. A. (2021). A Semantic Model Leveraging Pattern-based Ontology Terms to Bridge Environmental Exposures and Health Outcomes. CEUR Workshop Proceedings (http://ceur-ws.org), ISSN 1613-0073.

So in regard to swimming, something like this?

[Mondo diseases] "realized in response to"
  some (exposure to [bacteria/virus/fungi/parasite] via ingestion during swimming)
# e.g.
"Typhoid fever" "realized in response to"
  some (exposure to Salmonella Typhi via ingestion during swimming)
"Giardiasis" "realized in response to"
  some (exposure to Giardia duodenalis via ingestion during swimming)

In regard to the exposure event, something like this perhaps?

"exposure event" and
  ("has exposure stimulus" some [Salmonella Typhi]) and 
  ("has exposure route" some [ingestion]) and
  ("has exposure route" some [swimming]) 

@davidshumway
Author

davidshumway commented Sep 12, 2022

I was also wondering whether ECTO has plans / ideas around multiple routes as I see there are routes defined for some exposures. For example, from food_ingestion.tsv:

defined_class | defined_class_name | stressor
ECTO:0070000 | exposure to probiotic or bacteria supplement via ingestion | FOODON:03401308

For my use case, simply adding swimming to any exposure is the basic idea, e.g., exposure to Escherichia coli during swimming (i.e. using ECTO:3000009). Going further, this could be exposure to Escherichia coli during swimming via ingestion, or perhaps exposure to Escherichia coli during swimming (via ingestion or via skin contact).

Just saw this pattern as well...

defined_class | defined_class_name | stressor | medium | route
ECTO:0080000 | exposure to arsenic in water via ingestion | CHEBI:27563 | ENVO:00002006 | ExO:0000056

'exposure event' and
  ('has exposure route' some ingestion) and
  ('has exposure transport path' some 'liquid water') and 
  ('has exposure stimulus' some 'arsenic atom')
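
As a rough illustration of how a row like this one maps onto the class expression, here is a small Python sketch that fills in the expression from the row's columns. It is not dosdp-tools and does not read the actual pattern YAML; the `LABELS` lookup, the `COLUMN_RELATIONS` mapping, and the function names are all hypothetical, chosen only to reproduce the expression quoted above.

```python
# Hypothetical label lookup; a real pipeline would take labels from the
# imported source ontologies rather than a hard-coded dict.
LABELS = {
    "CHEBI:27563": "arsenic atom",
    "ENVO:00002006": "liquid water",
    "ExO:0000056": "ingestion",
}

# TSV column -> relation label, in the order used by the quoted expression.
COLUMN_RELATIONS = [
    ("route", "has exposure route"),
    ("medium", "has exposure transport path"),
    ("stressor", "has exposure stimulus"),
]


def quote(label):
    """Quote multi-word labels, as Manchester syntax requires."""
    return f"'{label}'" if " " in label else label


def render_expression(row, labels=LABELS):
    """Build the class expression text for one pattern row."""
    parts = ["'exposure event'"]
    for column, relation in COLUMN_RELATIONS:
        curie = row.get(column)
        if curie:  # columns missing from the row are simply left out
            parts.append(f"('{relation}' some {quote(labels[curie])})")
    return " and ".join(parts)


if __name__ == "__main__":
    row = {
        "defined_class": "ECTO:0080000",
        "stressor": "CHEBI:27563",
        "medium": "ENVO:00002006",
        "route": "ExO:0000056",
    }
    print(render_expression(row))
```

Because missing columns are skipped, the same sketch also covers a stressor-only row, which yields just the 'has exposure stimulus' clause.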

So to summarize...
Stressor: NCBITaxon:562 (ecoli)
Medium: (ENVO_00002149: seawater / various forms of water)
Route: (ExO:0000056: ingestion / various other routes e.g. skin contact)
[...]: [..., e.g. swimming @ NCIT_C94738] (swimming / wading / floating / diving)

And for swimming, perhaps it's similar to e.g. Exposure to walking (ECTO_6000003):

'exposure event' and ('has exposure stimulus' some Walking)

So that would make swimming and seawater two mediums e.g.

'exposure event' and
  ('has exposure stimulus' some NCIT_C94738) and
  ('has exposure transport path' some ENVO_00002149)

@laurenechan
Collaborator

These are all very interesting discussion points, @davidshumway, thank you for bringing them up!

I wonder what additional value adding the component of 'swimming' to the terms would bring. Are you hoping to have some way to aggregate exposures that occur while swimming? Or is this just adding further detail to a term like exposure to E coli in water via ingestion? I'm not sure I view swimming as a route: if the individual was exposed via ingestion of the water, the exposure is similar to ingesting a cup of water containing E coli. If the individual was simultaneously exposed to E coli via ingestion and absorption, simultaneous routes of exposure would be worth discussing. Does that need to be coined specifically as via swimming, though, or could we offer some kind of compound exposure route that clarifies multiple routes of exposure occurring at once (so it would be the same exposure route whether you were swimming in the water or ran through a sprinkler with contaminated water spraying on you that may have also been consumed)?

I am not sure if 'swimming' itself really tells us anything unique that would modify the exposure in a way that alters the outcome, but if it does, please feel free to elaborate, as I may just be missing that! Also, it would be interesting to discuss whether swimming terms have a viable use case and could be reused by other community members, or whether it would make more sense to include the swimming component as an annotation on a less specific term.

As for the question regarding outcomes of an exposure, I agree with your thought that Mondo or another relevant ontology would be the most appropriate place to include those outcomes if not already represented. Outcomes do not fit within the scope of ECTO in my opinion, but certainly having the outcomes and necessary relations modeled is of importance! Are there particular outcomes you are looking for? Mostly ones related to organisms in water?

@davidshumway
Author

davidshumway commented Sep 12, 2022

Thanks, @laurenechan.

I think swimming exposure to ecoli is somewhat unique in that a common source of ecoli in swimming waters, such as at a beach, would be wastewater treatment discharge, whereas water from a sprinkler or tap would be less likely (?) to include this source.

> Does that need to be coined specifically as via swimming, though, or could we offer some kind of compound exposure route that clarifies multiple routes of exposure occurring at once (so it would be the same exposure route whether you were swimming in the water or ran through a sprinkler with contaminated water spraying on you that may have also been consumed)?

I was definitely thinking of using compound exposures and wonder if this may be a suggested approach. E.g. the compound of Exposure to E coli via ingestion, Exposure to seawater, Exposure to swimming, Exposure to wastewater pollution. Not sure if this is the compound you're suggesting?

> As for the question regarding outcomes of an exposure, I agree with your thought that Mondo or another relevant ontology would be the most appropriate place to include those outcomes if not already represented. Outcomes do not fit within the scope of ECTO in my opinion, but certainly having the outcomes and necessary relations modeled is of importance! Are there particular outcomes you are looking for? Mostly ones related to organisms in water?

Related to organisms, fungi, viruses, parasites, plants. Perhaps 50-100 relevant entities. For outcomes, diseases or symptoms, so e.g. e-coli infection, abdominal pain, etc.

@davidshumway
Author

davidshumway commented Sep 15, 2022 via email

@laurenechan
Collaborator

@davidshumway this is a good question and likely one that would need some thought from both the ECTO and Mondo teams (and potentially other ontology groups as well depending on use cases). For diseases that have an infectious agent component, the relation in Mondo is more concrete, indicating that the infectious agent is in fact the source for the disease. Whereas within ECTO, an exposure to an infectious agent is not necessarily going to result in the infectious disease starting. Likely there is a confirmed exposure to the agent prior to the individual becoming ill, but there is no assurance that all exposures will equal disease. This one directional relation would be the case if we were to use 'infectious disease X' 'realized in response to' 'exposure to (agent)'. @nicolevasilevsky might have a better sense of what Mondo thinks is a good plan for using or not using ECTO terms in this kind of use case.
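
The one-directional reading described above can be sketched in a few lines of Python. This is purely illustrative: the dictionary stands in for hypothetical axioms of the form disease 'realized in response to' some exposure, with the two pairs borrowed from the examples earlier in this thread, and the function names are made up for the sketch.

```python
# Hypothetical axioms: disease 'realized in response to' some exposure.
REALIZED_IN_RESPONSE_TO = {
    "Typhoid fever": "exposure to Salmonella Typhi",
    "Giardiasis": "exposure to Giardia duodenalis",
}


def implied_exposure(disease):
    """A recorded disease implies that the linked exposure occurred."""
    return REALIZED_IN_RESPONSE_TO.get(disease)


def candidate_diseases(exposure):
    """An exposure only yields candidate outcomes, never a certain disease."""
    return sorted(d for d, e in REALIZED_IN_RESPONSE_TO.items() if e == exposure)


if __name__ == "__main__":
    print(implied_exposure("Giardiasis"))
    print(candidate_diseases("exposure to Salmonella Typhi"))
```

Going from disease to exposure returns a single implied exposure, while going from exposure to disease returns only a list of possibilities, which is the asymmetry at issue.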

There is an ongoing discussion about whether exposures to infectious agents belong in ECTO or in other ontologies. It is also a discussion point on another issue, #134, where we still don't have a firm decision on whether exposures to agents should be housed in ECTO or elsewhere.
