Fix handling of xrefs from OBO #1378

Open: wants to merge 3 commits into base: master


bgyori (Member) commented May 5, 2022

This PR fixes an issue in integrating xrefs from OBO-derived resources and increases the number of cross-references in the bioontology.

bgyori (Member, Author) commented May 6, 2022

Integrating xrefs between OBO-sourced IDs actually turns out to be problematic for several reasons:

  1. Several xrefs point to obsolete or non-existent entries in other ontologies.
  2. Several xrefs are technically valid but semantically incorrect, e.g., 'GO:GO:0140446' (fumigermin biosynthetic process) -> 'CHEBI:CHEBI:147341' (fumigermin), which equates a biological process with a chemical.
  3. Some OBOs, like MONDO, encode replaced-by relations as xrefs to other entries in the same ontology. These are currently picked up without further qualification, as if they were normal xrefs (example: MONDO:0014857, MONDO:0044630).
  4. There are non-trivial relationships with mappings from other sources, e.g., Biomappings, that should be reconciled.

Points 1 and 3 are relatively easy to address. I'm worried about point 2; one potential solution is to restrict the namespace pairs between which we integrate mappings, excluding e.g., GO-CHEBI.
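
A minimal sketch of how points 1 to 3 could be handled at integration time. It is not the code in this PR: the function name `filter_xrefs`, the `(namespace, identifier)` tuple representation, and the `EXCLUDED_NS_PAIRS` set are all hypothetical, and treating every same-namespace xref as a replaced-by relation is an assumption.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# Hypothetical representation: a term is a (namespace, identifier) pair
Term = Tuple[str, str]
Xref = Tuple[Term, Term]

# Hypothetical set of namespace pairs between which xrefs are not integrated
EXCLUDED_NS_PAIRS: Set[FrozenSet[str]] = {frozenset({"GO", "CHEBI"})}


def filter_xrefs(
    xrefs: Iterable[Xref],
    known_terms: Set[Term],
    obsolete_terms: Set[Term],
    excluded_ns_pairs: Set[FrozenSet[str]] = EXCLUDED_NS_PAIRS,
) -> List[Xref]:
    """Return only the xrefs that pass the three checks below."""
    kept = []
    for source, target in xrefs:
        # Point 1: skip targets that do not exist or are obsolete
        if target not in known_terms or target in obsolete_terms:
            continue
        # Point 3: skip xrefs within a single namespace (e.g., MONDO -> MONDO),
        # assumed here to be replaced-by relations rather than true xrefs
        if source[0] == target[0]:
            continue
        # Point 2: skip namespace pairs known to produce incorrect mappings
        if frozenset({source[0], target[0]}) in excluded_ns_pairs:
            continue
        kept.append((source, target))
    return kept
```

Excluding whole namespace pairs is coarse: it also drops any correct xrefs between those namespaces, which is the trade-off behind the concern about point 2.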

cthoyt (Collaborator) commented Jan 17, 2023

@bgyori can we revisit this? I think it will solve the issue I showed last Friday on CoGEx, where the MONDO term for asthma wasn't connected to the rest of the asthma terms by an xref relation.

I agree that point 2 in your last comment might be difficult to overcome. Since most relations don't have any semantics ascribed to them except "database cross-reference", there are lots of kinds of things in there, including references for shadow terms. In https://gist.github.com/cthoyt/e13b270060a602830b9eb02c45f6b716, I checked this and found the issue is not widespread: there seem to be 5 instances of this problem between EFO and ChEBI and 3 between GO and ChEBI. We could make PRs to these ontologies directly to fix them, encode some additional logic (a short blacklist), or do something else to address this.
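
The short blacklist option could look roughly like the sketch below. The names `XREF_BLACKLIST` and `apply_blacklist` are hypothetical, the single entry is the GO/ChEBI example from earlier in this thread, and matching in both directions is an assumption.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: a term is a (namespace, identifier) pair
Term = Tuple[str, str]
Xref = Tuple[Term, Term]

# Hypothetical blacklist of individual xrefs known to be incorrect
XREF_BLACKLIST: Set[Xref] = {
    # fumigermin biosynthetic process -> fumigermin
    (("GO", "GO:0140446"), ("CHEBI", "CHEBI:147341")),
}


def apply_blacklist(
    xrefs: Iterable[Xref],
    blacklist: Set[Xref] = XREF_BLACKLIST,
) -> List[Xref]:
    """Drop blacklisted xrefs, ignoring the direction of each pair."""
    # Store each pair as a frozenset so (a, b) and (b, a) both match
    blocked = {frozenset(pair) for pair in blacklist}
    return [xref for xref in xrefs if frozenset(xref) not in blocked]
```

With only a handful of known cases, a per-pair list like this keeps all other GO/ChEBI and EFO/ChEBI xrefs intact, at the cost of having to maintain it as the ontologies change.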

2 participants