Uniprot annotation in OC can be misleading #79

Open
ctokheim opened this issue Oct 20, 2021 · 1 comment

@ctokheim

I noticed a case where the variant annotation table seems to indicate that the primary transcript matches the UniProt protein sequence, when in fact it does not. For example, the screenshot below shows a ZBTB3 variant at position 39 on the primary Ensembl transcript, and the table suggests that this position also corresponds to UniProt entry Q9H5J0.

image

However, in the actual UniProt entry Q9H5J0, the Arg residue is at position 89, which matches the second Ensembl transcript.

image

It seems that the correct UniProt ID for the primary transcript is actually A0A6E1W9L1. Even if OC doesn't always use the canonical UniProt protein sequence for its primary transcript, the table should not suggest that it does.
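To illustrate the kind of check I have in mind, here is a minimal sketch that only reports a UniProt accession when its sequence is identical to the transcript's protein sequence. The function name `matching_accession` is hypothetical, and the sequences are toy strings, not the real ZBTB3 proteins. I am also assuming, purely for illustration, that the shorter protein is a truncation of the canonical one; I have not verified that.

```python
from typing import Dict, Optional


def matching_accession(transcript_protein: str,
                       uniprot_seqs: Dict[str, str]) -> Optional[str]:
    """Return the accession whose sequence equals the transcript's protein.

    Hypothetical helper: returns None when no candidate is an exact match,
    so the table can leave the UniProt column empty instead of guessing.
    """
    for accession, sequence in uniprot_seqs.items():
        if sequence == transcript_protein:
            return accession
    return None


# Toy sequences (assumption): an Arg placed at 1-based position 89 of the
# longer protein ends up at position 39 once the first 50 residues are removed.
canonical = "M" + "A" * 87 + "R" + "GG"
shorter = canonical[50:]

# Accessions are taken from this issue; the sequences attached to them are toys.
uniprot_seqs = {"Q9H5J0": canonical, "A0A6E1W9L1": shorter}

# Same residue, different coordinates in the two proteins.
print(canonical[89 - 1], shorter[39 - 1])

# Only the entry with an identical sequence is reported for the shorter protein.
print(matching_accession(shorter, uniprot_seqs))
```

With a check like this, a position on the primary transcript would never be shown next to an accession whose sequence uses different coordinates.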

@rkimoakbioinformatics
Contributor

@ctokheim Thanks for letting us know, Collin. It turns out that GENCODE v33 itself has incorrect UniProt mappings, which were transferred directly into OpenCRAVAT's mapper. In the latest Ensembl release (104), the ENST-to-UniProt mappings for the two transcripts are correct, as you described. A quick patch would be to fix just the UniProt mapping before releasing the mapper with a newer gene model. I'll see whether the quick patch is possible.
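As a rough sketch of what such a quick patch could look like (this is not OpenCRAVAT's mapper code), corrected accessions can be overlaid on the existing transcript-to-UniProt table without touching the gene model. The function name `apply_uniprot_patch` and the transcript ID `ENST_PRIMARY_PLACEHOLDER` are hypothetical; only the two accessions come from this thread.

```python
from typing import Dict


def apply_uniprot_patch(mapping: Dict[str, str],
                        patch: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `mapping` with corrected accessions from `patch`.

    Raises KeyError if the patch names a transcript that is not in the
    mapping, so a typo in the patch cannot silently add a new entry.
    """
    patched = dict(mapping)  # leave the original mapping untouched
    for transcript_id, accession in patch.items():
        if transcript_id not in patched:
            raise KeyError(f"unknown transcript in patch: {transcript_id}")
        patched[transcript_id] = accession
    return patched


# Placeholder transcript ID (assumption); the real ENST ID is not given here.
old_mapping = {"ENST_PRIMARY_PLACEHOLDER": "Q9H5J0"}
corrections = {"ENST_PRIMARY_PLACEHOLDER": "A0A6E1W9L1"}

new_mapping = apply_uniprot_patch(old_mapping, corrections)
print(new_mapping)
```

Keeping the corrections in a separate table makes them easy to drop once the mapper is rebuilt on a newer gene model.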
