I noticed a case where the variant annotation table suggests that the primary transcript matches the UniProt protein sequence when it actually does not. For example, see the screenshot below for a ZBTB3 variant at position 39 on the Ensembl transcript; the table seemingly indicates that this position also matches UniProt Q9H5J0.
However, if you go to the actual UniProt entry Q9H5J0, the Arg residue is at position 89, which matches the second Ensembl transcript.
It seems like the correct UniProt ID for the primary transcript is actually A0A6E1W9L1. My thought is that even if OC doesn't always use the canonical protein sequence from UniProt as its primary transcript, the table should be fixed so that it no longer suggests that it does.
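For anyone who wants to check this kind of mismatch by hand, here is a minimal sketch of the comparison, assuming you have already downloaded the candidate protein sequences yourself. The function names (`residue_at`, `entries_matching`), the entry IDs, and the toy sequences are all made up for illustration; none of this is OpenCRAVAT code or real UniProt data.

```python
from typing import Dict, List, Optional


def residue_at(sequence: str, position: int) -> Optional[str]:
    """Return the residue at a 1-based protein position, or None if out of range."""
    if 1 <= position <= len(sequence):
        return sequence[position - 1]
    return None


def entries_matching(candidates: Dict[str, str], position: int, expected: str) -> List[str]:
    """Return the IDs whose sequence has `expected` at the 1-based `position`."""
    return [
        entry_id
        for entry_id, seq in candidates.items()
        if residue_at(seq, position) == expected
    ]


# Toy sequences with placeholder IDs (hypothetical, NOT real UniProt entries).
toy = {
    "ENTRY_A": "MKRLA",    # R at position 3
    "ENTRY_B": "MAAKRLA",  # R at position 5
}

# Which entry actually has an Arg (R) at the reported position?
print(entries_matching(toy, 3, "R"))
```

With real data, you would replace the toy dictionary with the sequences for the UniProt IDs in question and pass the protein position and reference residue reported in the table.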
@ctokheim Thanks for letting us know, Collin. It turns out that GENCODE V33 itself has wrong UniProt mappings, which were transferred directly to OpenCRAVAT's mapper. I checked the latest Ensembl 104, and the ENST-to-UniProt mappings for the two transcripts are correct there, as you described. One quick patch would be to fix just the UniProt mapping before releasing the mapper with a newer gene model. I'll see if the quick patch is possible.