Resolving ambiguity in multiallelic sites #155

Open
e-271 opened this issue Aug 15, 2023 · 0 comments

e-271 commented Aug 15, 2023

I am using the 'by_alt' op, but my input VCF has many multiallelic sites. When some ALT alleles at a position are present in the annotation file and others are not, the output does not indicate which allele each annotation value belongs to. For example, if the alt allele 'A' is present in the annotation file and 'G' is not, vcfanno produces the following:
chr1 91246 1_91246_T_G T G,A 2896.0 . AC=9,1;PHRED=5.179;CADD=0.377863 GT:AD:DP:GQ:PL 0/1:16,21,12:52:85:681,0,494,644,85,1010 0/2:33,0,14:49:95:95,184,663,0,479,441
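
A rough way to find the affected records is to compare the number of values in each annotated INFO field against the number of ALT alleles. The sketch below is only an illustration in Python, not part of vcfanno; the function name `short_info_fields` is hypothetical, and it splits on any whitespace so that the record as pasted above can be used directly (a real VCF is tab-delimited).

```python
def short_info_fields(vcf_line, fields):
    """Return the INFO fields whose value count differs from the number of ALT alleles."""
    cols = vcf_line.split()              # real VCF is tab-delimited; whitespace split for the pasted example
    n_alts = len(cols[4].split(","))     # column 5 is ALT, comma-separated
    # Column 8 is INFO: semicolon-separated key=value pairs (flags without '=' are skipped)
    info = dict(kv.split("=", 1) for kv in cols[7].split(";") if "=" in kv)
    return [f for f in fields if f in info and len(info[f].split(",")) != n_alts]


record = (
    "chr1 91246 1_91246_T_G T G,A 2896.0 . "
    "AC=9,1;PHRED=5.179;CADD=0.377863 GT:AD:DP:GQ:PL "
    "0/1:16,21,12:52:85:681,0,494,644,85,1010 "
    "0/2:33,0,14:49:95:95,184,663,0,479,441"
)
print(short_info_fields(record, ["AC", "PHRED", "CADD"]))
```

This only flags that a field is short; it cannot tell which allele the single value belongs to, which is the ambiguity described here.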

(I used SNP alt alleles for the example, which will generally be present in the CADD annotation files, but many sites in my input files have a mix of SNPs and indels, so I am not sure how to resolve the ambiguity there.)

I think this could be improved if vcfanno output a placeholder such as '.' for alleles that have no annotation. The above example would become:
chr1 91246 1_91246_T_G T G,A 2896.0 . AC=9,1;PHRED=.,5.179;CADD=.,0.377863 GT:AD:DP:GQ:PL 0/1:16,21,12:52:85:681,0,494,644,85,1010 0/2:33,0,14:49:95:95,184,663,0,479,441
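
To make the requested behaviour concrete, here is a minimal Python sketch of the alignment I have in mind. It is not vcfanno's code; `align_by_alt` and the per-allele dictionaries are hypothetical names used only for illustration. Each ALT allele gets one slot, in ALT order, and alleles without an annotation get the placeholder.

```python
def align_by_alt(alts, annotations, missing="."):
    """Build a comma-separated INFO value with one entry per ALT allele.

    alts        -- ALT alleles in the order they appear in the VCF record
    annotations -- mapping from ALT allele to its annotation value (as a string)
    missing     -- placeholder for alleles absent from the annotation file
    """
    return ",".join(annotations.get(alt, missing) for alt in alts)


alts = ["G", "A"]
phred = {"A": "5.179"}     # only 'A' is present in the annotation file
cadd = {"A": "0.377863"}

print("PHRED=" + align_by_alt(alts, phred))
print("CADD=" + align_by_alt(alts, cadd))
```

With this layout every annotated field has the same number of entries as ALT, so the position of a value identifies its allele.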

Here are my example files to replicate this. Thank you!
testVcfannoMultiallele.tar.gz
