NCBI genes map to ensembl genes with invalid identifiers #94

Open
dhimmel opened this issue Dec 11, 2020 · 1 comment

@dhimmel

dhimmel commented Dec 11, 2020

I've noticed three genes where the value for ensembl.gene does not begin with ENSG:

https://mygene.info/v3/gene/263?fields=ensembl
ensembl.gene appears to actually be ENSG00000237801
{"_id": "263", "_version": 1, "ensembl": {"gene": "263", "transcript": "263-1", "translation": [], "type_of_gene": "rRNA"}}

https://mygene.info/v3/gene/55872?fields=ensembl
ensembl.gene appears to actually be ENSG00000168078
{"_id": "55872", "_version": 3, "ensembl": {"gene": "55872", "transcript": "55872-1", "translation": [], "type_of_gene": "tRNA"}}

https://mygene.info/v3/gene/126231?fields=ensembl
ensembl.gene appears to actually be ENSG00000189144
{"_id": "126231", "_version": 2, "ensembl": {"gene": "126231", "transcript": "126231-1", "translation": [], "type_of_gene": "tRNA"}}

In these cases, it seems the value for ensembl.gene has been set to entrezgene (the ncbigene id). Any ideas what the problem is?
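A check like the one described above can be sketched in a few lines of Python. This is only an illustration: the documents are the three responses quoted in this issue, the helper name `invalid_ensembl_genes` is made up for the example, and the pattern assumes human Ensembl gene ids have the form `ENSG` followed by 11 digits.

```python
import re

# Assumption: human Ensembl gene ids are "ENSG" followed by 11 digits.
ENSG_PATTERN = re.compile(r"^ENSG\d{11}$")

# The three responses quoted in this issue (trimmed to the relevant fields).
docs = [
    {"_id": "263", "ensembl": {"gene": "263", "transcript": "263-1", "type_of_gene": "rRNA"}},
    {"_id": "55872", "ensembl": {"gene": "55872", "transcript": "55872-1", "type_of_gene": "tRNA"}},
    {"_id": "126231", "ensembl": {"gene": "126231", "transcript": "126231-1", "type_of_gene": "tRNA"}},
]


def invalid_ensembl_genes(doc):
    """Return the ensembl.gene values in doc that do not look like human ENSG ids."""
    ensembl = doc.get("ensembl", [])
    # The ensembl field can be a single object or a list of objects,
    # so normalize to a list before checking.
    entries = ensembl if isinstance(ensembl, list) else [ensembl]
    return [
        entry["gene"]
        for entry in entries
        if "gene" in entry and not ENSG_PATTERN.match(entry["gene"])
    ]


# Map each _id to its suspicious ensembl.gene values, dropping clean documents.
flagged = {doc["_id"]: invalid_ensembl_genes(doc) for doc in docs}
flagged = {gene_id: bad for gene_id, bad in flagged.items() if bad}
print(flagged)
```

All three documents are flagged because their `ensembl.gene` value is identical to the `_id` rather than an `ENSG` identifier.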

@kevinxin90
Copy link
Contributor

This issue was introduced when we integrated Metazoa species data from Ensembl through BioMart.

File path: ensembl_metazoa/49/gene_ensembl__gene__main.txt
text based search: awk '$2 == "263" { print $0 }' gene_ensembl__gene__main.txt
returns: 27923 263 rns 3153 3520 Mt 1 rRNA

Since no entrezgene id can be mapped to this record, we use its Ensembl gene id (263) as the _id. That value happens to coincide with the _id of the human genedoc from Entrez (_id:263), so the Metazoa ensembl data ends up on the human gene.
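The collision can be illustrated with a small, self-contained sketch. This is not the actual mygene.info pipeline code: `choose_id`, `entrez_docs`, and the dictionary-based merge are hypothetical stand-ins that only mimic the fallback described above.

```python
def choose_id(ensembl_gene_id, entrez_by_ensembl):
    """Hypothetical fallback: use the mapped Entrez id, else the Ensembl id itself."""
    return entrez_by_ensembl.get(ensembl_gene_id, ensembl_gene_id)


# Human gene document that came from Entrez, keyed by its numeric _id.
entrez_docs = {"263": {"_id": "263", "source": "entrez (human)"}}

# Metazoa record from the BioMart dump; its gene id is the bare string "263".
metazoa_record = {"gene": "263", "type_of_gene": "rRNA"}

# No Entrez mapping exists for this record, so the Ensembl id becomes the _id.
doc_id = choose_id(metazoa_record["gene"], entrez_by_ensembl={})

# A merge keyed only on _id cannot tell the two genes apart,
# so the Metazoa data is attached to the human document.
collided = doc_id in entrez_docs
merged = dict(entrez_docs.get(doc_id, {"_id": doc_id}))
merged["ensembl"] = metazoa_record

print(collided, merged)
```

Under these assumptions, any Ensembl record whose gene id is purely numeric and has no Entrez mapping can land on an unrelated Entrez document with the same number.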
