Structural variant vcf annotation #97
Comments
What sort of annotation would you hope to see? Names of spanned genes? +?
|
Yes, basically annotation with a genomic feature (gene, exon/intron, UTR, etc.) and the possible consequence for gene expression (e.g. a frameshift in an exon, or an exon deletion/duplication/inversion). VEP handles SV VCFs, so its annotation can be taken as an example. |
https://github.com/Illumina/Nirvana
might also be relevant?
On Wed, Feb 16, 2022 at 9:19 AM EugeneEA wrote:
> Yes, basically annotation with a genomic feature (gene, exon/intron, UTR, etc.) and the possible consequence for gene expression (e.g. a frameshift in an exon, or an exon deletion/duplication/inversion). VEP handles SV VCFs, so its annotation can be taken as an example.
Mike Cariaso
http://www.cariaso.com
|
Probably, I have not tried it |
@EugeneEA Hi, yes, there is a plan to add support for SV, CNV, etc. in the future. It could be in this repo or a fork. |
@rkimoakbioinformatics Thanks for the answer. As far as I understand, this is not planned for the near future, but rather for later development? |
@EugeneEA I would like to start a discussion on it. Can you let me know what kind of output columns you would need? Something like the following?
For imprecise structural variants, would you still want to see predicted protein sequence change? |
@rkimoakbioinformatics Sorry for the long delay. Yes, that would definitely be sufficient for starters. The tricky part is probably fields such as "transcript_ablation"; maybe an additional column should be added here, for example one listing the deleted exons? |
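As an illustration of the "deleted exons" column suggested above, here is a minimal Python sketch. It is not OpenCRAVAT code: the function name `affected_exons`, the output keys, and the exon coordinates are all hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from typing import Dict, List, Tuple

# (exon_number, start, end); 1-based inclusive coordinates are assumed
Exon = Tuple[int, int, int]


def affected_exons(sv_start: int, sv_end: int, exons: List[Exon]) -> Dict[str, List[int]]:
    """Split the exons of one transcript into fully and partially covered ones."""
    full: List[int] = []
    partial: List[int] = []
    for number, start, end in exons:
        if sv_start <= start and end <= sv_end:
            # The SV interval contains the whole exon
            full.append(number)
        elif start <= sv_end and sv_start <= end:
            # The intervals overlap, but the exon is not fully contained
            partial.append(number)
    return {"deleted_exons": full, "partially_deleted_exons": partial}


# Hypothetical transcript with three exons and a deletion spanning 250-550
exons = [(1, 100, 200), (2, 300, 400), (3, 500, 600)]
result = affected_exons(250, 550, exons)
```

For an imprecise call, the same function could be run on the outer bounds of the confidence interval, which would list every exon that might be affected.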
@EugeneEA Thanks. Below is a sketch. The current format of
A VCF format specification document has a few structural variant examples:
Turning this into something like:
Of course, Would something like the above work for your purposes? Any feedback/suggestion would be appreciated. |
@rkimoakbioinformatics Thanks a lot for the replies! OK, that looks a bit too verbose; can we select the major transcript as we do for SNPs? Sequence ontology is an extremely useful field, but its aggregation in VEP (https://www.ensembl.org/info/genome/variation/prediction/predicted_data.html) simplifies filtering quite a bit, so it may be worth implementing (also for SNPs). Would these variants also be annotated if they are present in some of the annotators (ClinVar, for example)? (I know that these are basically indels, but it still might be useful.) |
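If "aggregation" here refers to the IMPACT classes (HIGH, MODERATE, LOW, MODIFIER) that the linked Ensembl page assigns to each sequence ontology term, the filtering idea could be sketched as below. This is only an assumption about what was meant. The mapping covers a handful of terms, the helper `worst_impact` is hypothetical, and treating unknown terms as MODIFIER is a choice made for this sketch.

```python
from typing import Iterable

# A small subset of Sequence Ontology terms with the impact class listed on
# the Ensembl predicted-data page; extend as needed.
IMPACT_BY_TERM = {
    "transcript_ablation": "HIGH",
    "frameshift_variant": "HIGH",
    "stop_gained": "HIGH",
    "missense_variant": "MODERATE",
    "inframe_deletion": "MODERATE",
    "synonymous_variant": "LOW",
    "intron_variant": "MODIFIER",
    "intergenic_variant": "MODIFIER",
}

# Ordered from least to most severe
IMPACT_RANK = ["MODIFIER", "LOW", "MODERATE", "HIGH"]


def worst_impact(terms: Iterable[str]) -> str:
    """Collapse several SO terms into the single most severe impact class."""
    ranks = [IMPACT_RANK.index(IMPACT_BY_TERM.get(term, "MODIFIER")) for term in terms]
    return IMPACT_RANK[max(ranks)] if ranks else "MODIFIER"


summary = worst_impact(["intron_variant", "frameshift_variant"])
```

A single column like this lets a user filter on one value instead of matching against a list of individual ontology terms.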
@EugeneEA Yes, if the variants are in ClinVar or any other OpenCRAVAT annotator, they will be annotated. I am not sure yet how imprecise variants are treated in annotation data sources, but that will be the spirit. As far as I know, VEP outputs sequence ontologies for each transcript on separate lines in its native output format, or on the same line, delimited, in the VCF format. I am not aware of aggregation by VEP; does it aggregate? If by aggregation you mean something like showing all sequence ontologies from all the variants for a transcript together, that has been planned, but we haven't gotten to work on it yet. By selecting a major transcript, do you mean the current OpenCRAVAT style of showing the mutation consequence on a representative transcript (either a MANE one or a custom choice for a gene), with the consequences on all the other transcripts where the variant falls shown in another column? |
@rkimoakbioinformatics Yes, that is exactly what I meant: either consequence or MANE, and the rest goes to another column. |
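The "representative transcript plus everything else in another column" layout agreed on above might look roughly like this. The sketch does not use OpenCRAVAT's actual API or column names: `split_consequences`, the keys `primary` and `all_others`, and the transcript IDs are hypothetical.

```python
from typing import Dict, List, Optional, Set


def split_consequences(
    per_transcript: Dict[str, List[str]],
    representative_ids: Set[str],
) -> Dict[str, Optional[str]]:
    """Put the first representative transcript in one column, the rest in another."""
    primary: Optional[str] = None
    others: List[str] = []
    for transcript_id, terms in per_transcript.items():
        entry = f"{transcript_id}:{','.join(terms)}"
        if primary is None and transcript_id in representative_ids:
            primary = entry
        else:
            others.append(entry)
    return {"primary": primary, "all_others": ";".join(others)}


# Hypothetical per-transcript consequences for one deletion
consequences = {
    "TX1": ["intron_variant"],
    "TX2": ["transcript_ablation"],
    "TX3": ["frameshift_variant"],
}
row = split_consequences(consequences, representative_ids={"TX2"})
```

Here `representative_ids` stands in for whatever selection rule is used, such as MANE transcripts or a custom per-gene choice.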
Hi EugeneEA, just to catch you up: Rick Kim is no longer on the OpenCRAVAT team, but we are actively developing structural variant mapping and annotations. We'd appreciate it if you could share any other features that would interest you, in addition to your comments from early 2022. |
Hi! Nothing beyond what was mentioned earlier so far, but in general it would be super helpful if your SV support followed the same framework as the usual SNP/indel modules in terms of the possibility of adding custom annotators. For example, we are analyzing a lot of samples with some SV detection tools, and currently I have to annotate each new sample with the SV frequency from an internal database using VEP + custom scripts. I'd love to switch to oc for both tasks. Therefore, for me a "VEP aggregation" column (or a set of columns which I can use as secondary input for such an annotator) and the possibility to add a custom annotator are a must. Best, Eugene |
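The internal-frequency use case described above could be prototyped along these lines. This is a sketch under stated assumptions, not an OpenCRAVAT annotator: matching by 50% reciprocal overlap is a common convention chosen here for illustration, and the function names, the tuple layout of the database, and the example records are hypothetical.

```python
from typing import List, Tuple

# (chrom, start, end, svtype, carrier_count); hypothetical database layout
DbRecord = Tuple[str, int, int, str, int]


def reciprocal_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> float:
    """Fraction of overlap relative to the longer of the two intervals."""
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def sv_frequency(
    query: Tuple[str, int, int, str],
    database: List[DbRecord],
    n_samples: int,
    min_overlap: float = 0.5,
) -> float:
    """Frequency of the best-matching database SV, or 0.0 if nothing matches."""
    chrom, start, end, svtype = query
    best_score, best_count = 0.0, 0
    for d_chrom, d_start, d_end, d_type, d_count in database:
        if d_chrom != chrom or d_type != svtype:
            continue
        score = reciprocal_overlap(start, end, d_start, d_end)
        if score >= min_overlap and score > best_score:
            best_score, best_count = score, d_count
    return best_count / n_samples


# Hypothetical internal database built from 200 samples
database = [
    ("chr1", 964950, 965340, "DEL", 12),
    ("chr1", 964964, 965330, "DUP", 3),
]
freq = sv_frequency(("chr1", 964964, 965330, "DEL"), database, n_samples=200)
```

Taking only the best match avoids counting the same carriers twice when several database entries overlap the query.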
Hi, I've come across the problem that oc does not annotate SV VCFs. Are there plans to support SVs in the future, or maybe there is a workaround?
The common line format:
chr1 964964 20 N <DEL> 137.6 . SVTYPE=DEL;SVLEN=-366;END=965330;STRANDS=+-:10;IMPRECISE;CIPOS=-30 ...etc
Best, Eugene
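For reference, a record like the one above can be reduced to an interval with plain Python before any annotation step. This is a minimal sketch, not part of oc: `parse_sv_line` and its output keys are hypothetical, the example line keeps only some of the INFO fields shown in the post, and a single-valued `SVLEN` is assumed.

```python
from typing import Any, Dict


def parse_sv_line(line: str) -> Dict[str, Any]:
    """Extract the SV interval and type from one tab-separated VCF data line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, _ref, alt, _qual, _filter, info = fields[:8]

    # INFO holds key=value pairs and bare flags such as IMPRECISE
    info_map: Dict[str, Any] = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        info_map[key] = value if sep else True

    return {
        "chrom": chrom,
        "start": int(pos),
        "end": int(info_map["END"]) if "END" in info_map else None,
        "alt": alt,
        "svtype": info_map.get("SVTYPE"),
        "svlen": int(info_map["SVLEN"]) if "SVLEN" in info_map else None,
        "imprecise": "IMPRECISE" in info_map,
    }


line = "chr1\t964964\t20\tN\t<DEL>\t137.6\t.\tSVTYPE=DEL;SVLEN=-366;END=965330;IMPRECISE"
record = parse_sv_line(line)
```

The `imprecise` flag is kept separate so that downstream steps can decide how to treat breakpoints that are only approximate.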