missing Wikipedia URLs in MyGene.info #71

Open
gtsueng opened this issue Aug 7, 2019 · 1 comment
gtsueng (Contributor) commented Aug 7, 2019

Currently, about 1500 human genes in Wikidata have corresponding URLs in English Wikipedia, but MyGene.info does not return them (i.e., they are missing from MyGene.info).

For example, NRN1 (https://www.wikidata.org/wiki/Q18040171) is linked to https://en.wikipedia.org/wiki/NRN1 in Wikidata, but MyGene.info (https://mygene.info/v3/gene/51299?fields=wikipedia%2C%20symbol) does not return the URL.
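To make the NRN1 check above reproducible, here is a minimal sketch using only the Python standard library. The helper names (`build_gene_url`, `has_wikipedia`, `fetch_gene`) are made up for illustration, and the `{"url_stub": ...}` shape mentioned in the comments is an assumption about what MyGene.info returns when the field is present, not something confirmed in this thread.

```python
import json
import urllib.parse
import urllib.request

# Gene endpoint taken from the URL quoted in this issue.
MYGENE_GENE_URL = "https://mygene.info/v3/gene/{entrez_id}"


def build_gene_url(entrez_id, fields=("wikipedia", "symbol")):
    """Build the gene URL, restricting the response to the given fields."""
    query = urllib.parse.urlencode({"fields": ",".join(fields)})
    return MYGENE_GENE_URL.format(entrez_id=entrez_id) + "?" + query


def has_wikipedia(doc):
    """Return True if a gene document carries a non-empty 'wikipedia' field.

    Assumption: when present, the field looks like {"url_stub": "NRN1"}.
    """
    return bool(doc.get("wikipedia"))


def fetch_gene(entrez_id):
    """Fetch one gene document from MyGene.info (requires network access)."""
    with urllib.request.urlopen(build_gene_url(entrez_id), timeout=30) as resp:
        return json.load(resp)
```

Calling `has_wikipedia(fetch_gene(51299))` would then show whether NRN1 has a Wikipedia entry in MyGene.info at the time of the request.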

A SPARQL query in Python for pulling human genes with English Wikipedia links is in this gist.

A Python notebook for pulling the missing URLs.
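For readers without access to the gist and notebook, the general approach might look like the sketch below. This is not the code from the gist; it is an independent illustration. The query relies on the Wikidata properties P351 (Entrez Gene ID) and P703 (found in taxon) and the item Q15978631 (Homo sapiens). The function names and the User-Agent string are hypothetical.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Human genes (Entrez ID + taxon Homo sapiens) that have an English
# Wikipedia sitelink. Prefixes wd:, wdt:, schema: are predefined by the
# Wikidata Query Service.
QUERY = """
SELECT ?gene ?entrez ?article WHERE {
  ?gene wdt:P351 ?entrez ;
        wdt:P703 wd:Q15978631 .
  ?article schema:about ?gene ;
           schema:isPartOf <https://en.wikipedia.org/> .
}
"""


def run_query(query=QUERY):
    """Run the SPARQL query and return the parsed JSON (requires network)."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    # Hypothetical User-Agent; the query service asks clients to identify themselves.
    req = urllib.request.Request(url, headers={"User-Agent": "wikipedia-url-check/0.1"})
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)


def parse_bindings(result):
    """Map Entrez Gene ID -> English Wikipedia URL from a SPARQL JSON result."""
    return {
        row["entrez"]["value"]: row["article"]["value"]
        for row in result["results"]["bindings"]
    }


def missing_urls(wikidata_map, mygene_docs):
    """Return Wikidata entries whose MyGene.info document lacks 'wikipedia'.

    mygene_docs maps Entrez Gene ID -> gene document (a dict).
    """
    return {
        entrez: url
        for entrez, url in wikidata_map.items()
        if not mygene_docs.get(entrez, {}).get("wikipedia")
    }
```

`missing_urls(parse_bindings(run_query()), docs)` would give the set of genes to backfill, where `docs` holds the MyGene.info documents fetched separately.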

newgene (Member) commented Aug 7, 2019

@gtsueng Yes, the Wikipedia data source in MyGene.info is not updated automatically. It is still the initial version we loaded when we created the parser for it.

This is probably a good time to set up a proper "dumper" (for automatically pulling data from the source) for the Wikipedia data source, based on the code snippet you provided.
