Epoprostenol used to treat rats #2243
Comments
Bill stated it more elegantly than I did. Do we/can we employ domain and range constraints to avoid this kind of thing: |
The KG2 API does actually filter out edges that violate such domain/range specifications, but they're still in the underlying KG2c graph, which xDTD is trained on (I think). Maybe those edges should be excluded from the graph used for training? They're easily identifiable by the |
Do we need a fix for this in the Lobster release? Hoping the answer is no, and that we can instead aim to fix this in the Octopus release? |
I'm not sure that I am informed enough to have an opinion about whether or not we should include edges with |
Hi @edeutsch and @saramsey, I think both solutions (1. use filtered KG to train xDTD; 2. add a filter to the xDTD outputs) work for this issue. However, I will say option 2 will be easier and more flexible considering the long training time of xDTD. For option 1, are we sure that the edges with |
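For reference, option 2 (a post-hoc filter on the xDTD outputs) could look roughly like the sketch below. It is only an illustration under stated assumptions: each prediction is assumed to be a dict with a `paths` list, each path is assumed to be a list of edge dicts, and the flag name `excluded` is a hypothetical placeholder, not the actual KG2c edge attribute. `filter_predictions` is likewise a made-up helper, not part of xDTD.

```python
# Hypothetical placeholder; NOT the real KG2c edge attribute name.
EXCLUSION_FLAG = "excluded"


def filter_predictions(predictions, flag=EXCLUSION_FLAG):
    """Drop every path that contains a flagged edge, and drop any
    prediction left with no supporting path at all."""
    kept = []
    for pred in predictions:
        clean_paths = [
            path
            for path in pred["paths"]
            if not any(edge.get(flag, False) for edge in path)
        ]
        if clean_paths:
            # Copy the prediction so the caller's input is not mutated.
            kept.append({**pred, "paths": clean_paths})
    return kept


if __name__ == "__main__":
    demo = [
        {"drug": "D1", "disease": "X1",
         "paths": [[{"id": "e1"}], [{"id": "e2", "excluded": True}]]},
        {"drug": "D2", "disease": "X2",
         "paths": [[{"id": "e3", "excluded": True}]]},
    ]
    for pred in filter_predictions(demo):
        print(pred["drug"], pred["disease"], len(pred["paths"]))
```

Because a filter like this runs on the outputs, it can be changed or switched off without re-training the model, which is the flexibility argument made above.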
Adding a filter seems fine to me - and I take back my statement that those edges should be removed from the training dataset specifically, ha - I don't know enough about xDTD to know whether that would make sense. But I agree with Steve that at least the results that
I think @saramsey or @sundareswarpullela or @acevedol know more about this than me, but from what I can tell, I think it's only SemMedDB edges that are marked as |
@chunyuma since it takes so long to re-train xDTD, what about the following path forward:
Sure, I can add a filter to the xDTD output. Can I know where I can find the edge attribute |
Huh, that's weird. I see it in my copy of KG2.8.4c:
Also note that currently the values for |
Thanks @amykglen! I will check it again. |
Hi team, I have already updated the xDTD database for KG2.8.4 to exclude all edges with |
Thanks @amykglen. Now the updated xDTD database has passed the |
I was assigned this issue by TAQA:
NCATSTranslator/Feedback#707
Apparently xDTD was trained with KG2-SemMedDB that asserts that Epoprostenol is used to treat rats. And there are lots of papers describing treatment of rats with Epoprostenol. But apparently this is not an appreciated answer.
It is unclear to me whether we just want to remove such SemMedDB edges in KG2,
or whether the xDTD training data can be refined to exclude Drug-treats-X edges where X is a species,
or whether this problem goes away on its own with the upcoming KG2 "treats" refactor, where I assume we should make an effort to ensure that ideas like:
Drug X was used to attempt to treat disease Y in species Z
are NOT encoded as:
Drug X treats species Z
Anyone have ideas on how to handle the TAQA issue?
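To make the second option concrete, here is a minimal sketch of dropping Drug-treats-X edges from the training data when X is not a disease or phenotype. Everything in it is an assumption for illustration: the edge and node layout, the helper names, and the example CURIEs are hypothetical, and the allowed-category set reflects my understanding of the Biolink range for `treats` rather than anything confirmed in KG2.

```python
TREATS = "biolink:treats"

# Assumed acceptable object categories for a "treats" edge.
ALLOWED_OBJECT_CATEGORIES = {"biolink:Disease", "biolink:PhenotypicFeature"}


def is_valid_treats_edge(edge, node_categories):
    """Return False only for treats edges whose object has none of the
    allowed categories (for example, a species node)."""
    if edge["predicate"] != TREATS:
        return True  # This check only constrains treats edges.
    categories = node_categories.get(edge["object"], set())
    return bool(categories & ALLOWED_OBJECT_CATEGORIES)


def filter_training_edges(edges, node_categories):
    """Keep only the edges that pass the treats range check."""
    return [e for e in edges if is_valid_treats_edge(e, node_categories)]


if __name__ == "__main__":
    # Placeholder identifiers, not real CURIEs.
    node_categories = {
        "EX:some_disease": {"biolink:Disease"},
        "EX:rats": {"biolink:OrganismTaxon"},
    }
    edges = [
        {"subject": "EX:drug", "predicate": TREATS, "object": "EX:some_disease"},
        {"subject": "EX:drug", "predicate": TREATS, "object": "EX:rats"},
    ]
    for edge in filter_training_edges(edges, node_categories):
        print(edge["subject"], edge["predicate"], edge["object"])
```

Note that an object with no known category is also dropped here; whether that is the right default for training data is a judgment call.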