Unrecognized biotype/effect #349

Open
hosseinvk opened this issue Sep 22, 2023 · 2 comments

hosseinvk commented Sep 22, 2023

Hi,

I ran vcf2maf with VEP 108. The runs complete, but they print a large number of warnings like the following:

WARNING: Unrecognized biotype "protein_coding_CDS_not_defined". Assigning lowest priority!
WARNING: Unrecognized effect "splice_polypyrimidine_tract_variant". Assigning lowest priority!

Is this a concern, given that the VCFs were already annotated with VEP?
Thanks for any advice.

gianfilippo commented

Hi,

Same issue here. I updated VEP and the cache to 107, and I am getting lots of:
WARNING: Unrecognized biotype "protein_coding_LoF". Assigning lowest priority!
WARNING: Unrecognized effect "splice_polypyrimidine_tract_variant". Assigning lowest priority!

Can you please advise?
Thanks

Teezi commented Dec 5, 2023

Hi guys,
I'm also seeing the same problem (please see the warnings at the end of this comment), but I'm happy to find that it's not a major issue: newer Ensembl releases (v110 in my case) contain biotypes/effects that are not yet included in vcf2maf.

To silence those warnings, my suggestion is to add the unrecognized biotypes/effects to the vcf2maf source code, under the %biotype_priority or %effectPriority sections.
For example, the biotype documentation says that protein_coding_CDS_not_defined replaces the “processed_transcript” transcript biotype in protein_coding genes, so we can give protein_coding_CDS_not_defined the same priority as processed_transcript.

For the meanings of the biotypes and effects, please refer to:

  1. Gene/Transcript Biotypes in GENCODE & Ensembl,
  2. Ensembl Variation - Calculated variant consequences
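
The idea behind that suggestion can be sketched in a few lines. This is a Python illustration only, not vcf2maf's actual code (which is Perl): the table contents, the numeric priorities, the "lower number wins" convention, and the `get_priority` helper are all assumptions made up for the example. Only the term names and the warning text come from this thread.

```python
# Hypothetical priority table; lower number = higher priority.
# These values are NOT taken from vcf2maf, they only illustrate the mechanism.
biotype_priority = {
    "protein_coding": 1,
    "processed_transcript": 2,
}


def get_priority(term, table, kind="biotype"):
    """Return (priority, warning); unknown terms fall back to the lowest priority."""
    if term in table:
        return table[term], None
    lowest = max(table.values()) + 1  # assumed fallback: one rank below everything known
    warning = f'WARNING: Unrecognized {kind} "{term}". Assigning lowest priority!'
    return lowest, warning


# Before the fix: the new Ensembl biotype is unknown, so it is ranked last.
before, warning_before = get_priority("protein_coding_CDS_not_defined", biotype_priority)

# The fix: register the new term with the priority of the biotype it replaces.
biotype_priority["protein_coding_CDS_not_defined"] = biotype_priority["processed_transcript"]

# After the fix: no warning, and the term ranks like processed_transcript.
after, warning_after = get_priority("protein_coding_CDS_not_defined", biotype_priority)
print(before, warning_before)
print(after, warning_after)
```

The same reasoning would apply to the unrecognized effects: look up what each new consequence term means and give it the priority of the closest existing term.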

Hope it helps,
Cheers!

---------------------- Here is the problem ------------------
I used Ensembl VEP cache version 110 and got 90+ lines of warnings, which can be summarised into the following types:

WARNING: Unrecognized biotype "protein_coding_CDS_not_defined". Assigning lowest priority!
WARNING: Unrecognized biotype "protein_coding_LoF". Assigning lowest priority!
WARNING: Unrecognized effect "splice_donor_5th_base_variant". Assigning lowest priority!
WARNING: Unrecognized effect "splice_donor_region_variant". Assigning lowest priority!
WARNING: Unrecognized effect "splice_polypyrimidine_tract_variant". Assigning lowest priority!
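
To reduce a long run of such warnings to the distinct types, something like the following Python sketch may help. It assumes the warnings have the exact wording shown above; the `summarise_warnings` function and the sample lines are hypothetical, written only for this example.

```python
import re
from collections import Counter

# Matches the warning format quoted in this thread.
WARNING_RE = re.compile(r'WARNING: Unrecognized (biotype|effect) "([^"]+)"')


def summarise_warnings(lines):
    """Count each distinct (kind, term) pair found in the given lines."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Made-up sample input; a real log would contain many repeats per term.
sample_log = [
    'WARNING: Unrecognized biotype "protein_coding_LoF". Assigning lowest priority!',
    'WARNING: Unrecognized effect "splice_polypyrimidine_tract_variant". Assigning lowest priority!',
    'WARNING: Unrecognized effect "splice_polypyrimidine_tract_variant". Assigning lowest priority!',
    'WARNING: Unrecognized effect "splice_donor_region_variant". Assigning lowest priority!',
    "some unrelated log line",
]

summary = summarise_warnings(sample_log)
for (kind, term), n in sorted(summary.items()):
    print(f"{kind}\t{term}\t{n}")
```

To run it on real output, pass an open file instead of the sample list, e.g. `summarise_warnings(open("vcf2maf.log"))`, where `vcf2maf.log` is a hypothetical file holding the captured warnings.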
