Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set --named-classes-only for reduce by default #1043

Open
matentzn opened this issue Apr 7, 2024 · 9 comments
Open

Set --named-classes-only for reduce by default #1043

matentzn opened this issue Apr 7, 2024 · 9 comments

Comments

@matentzn
Copy link
Contributor

matentzn commented Apr 7, 2024

This is a big change, but imo makes sense.

https://robot.obolibrary.org/reduce is great at ensuring that our classification is non redundant, eg.

A sub B
B sub C
A sub C

is reduce to

A sub B
B sub C

Which corresponds to the "transitive reduction" (hence reduce).

Now this is super cool and important, but unfortunately for some use cases

A sub B
A sub R some D
B sub R some D

will also be reduced, to:

A sub B
B sub R some D

This is because A sub R some D is implied by the other two axioms.

This is nice if your goal is to reduce redundancy, but not often very user friendly, especially for people that wish to translate the ontology into KGs, or users of the JSON and OBOgraphs release files.

I am suggesting to add a parameter to the config:

reduce: named-only

and have it switched on by default. Objections?

@gouttegd
Copy link
Contributor

If you’re looking for feedback you should probably “advertise” this kind of proposed changes and requests for comment on other channels than just the issue tracker – most notably the #ontology-development-kit channel on the OBO Slack. I assume many ODK users (including those who may have an opinion about such a change) do not routinely monitor the issue tracker.

(FWIW, I have no objection myself to the proposed change, as long as (1) it is documented and (2) people who want to keep the current behaviour have a way to do so.)

@matentzn
Copy link
Contributor Author

Thank you @gouttegd! Did announce it in slack.

@Clare72
Copy link
Contributor

Clare72 commented Apr 19, 2024

I would vote for mirroring the name of the robot option rather than changing slightly.
i.e. reduce: named-classes-only
Also nervous about just changing a default like this, I think maybe it should be opt-in, i.e. users need to add the above to their -odk.yaml if they want it.

@matentzn
Copy link
Contributor Author

I would vote for mirroring the name of the robot option rather than changing slightly.
i.e. reduce: named-classes-only

100% I agree;

Also nervous about just changing a default like this, I think maybe it should be opt-in, i.e. users need to add the above to their -odk.yaml if they want it.

I am concerned that if we dont change the default, this will not change, but tbh, I am not entirely wedded to the "default". Lets see what others think.

Thanks @Clare72 for your feedback!

@cmungall
Copy link
Contributor

I don't understand the rationale for the change. I'm strongly in favor of keeping the default.

I don't even see the scenario described arising because such redundant axioms shouldn't even be asserted in the ontology to begin with. If they are they should be removed

@matentzn
Copy link
Contributor Author

The rationale is this:

Imagine a phenotype ontology with two classes:

Abnormal heart morphology and increased heart size. Imagine them being subclass related. Now, how will a normal user ever be able to query for "all phenotypes affecting the heart"?

You will either need a reasoner or a relation graphed version of the ontology which essentially materialises the "affects heart" relationship.

I thought I was acting on your behalf here @cmungall, making base variant self contained and usable without the need of reasoning.. seems not 😬

@cmungall
Copy link
Contributor

You need a reasoner either way, or at least simple graph walking. Imagine there is a subclass of AHM that doesn't have a logical definition, and a subclass of that subclass. You need to walk up the graph to find the existential restriction.

In fact this kind of structure is absolutely the norm for most ontologies. See:

INCATools/ontology-access-kit#739

@cmungall
Copy link
Contributor

For the record I am not against reasoning. This is why every semsql release comes with a precomputed entailed_edge table. This is both necessary and sufficient for the vast majority of non-trivial use cases involving ontologies. Distribution of this table (or even better, distribution of a semsql sqlite/duckdb database) with every OBO ontology should be the norm.

@matentzn
Copy link
Contributor Author

Ok, so I guess form my side, the question was whether the more graph-shaped obographs-json / obo-format releases et al should get the existential restrictions fully materialised percolated now the hierarchy or not, but it seems like this introduces too much redundancy then..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants