vcfanno fails to parse header of bcf file #154

Open
edg1983 opened this issue Apr 12, 2023 · 2 comments

edg1983 commented Apr 12, 2023

Hello,

I'm using vcfanno 0.3.3 to annotate a bcf file that has undergone several previous processing steps, including bcftools fill-tags, bcftools filter, bcftools csq.

I'm using a toml config file that worked fine before so I assume this is OK.

Essentially, vcfanno seems to have problems parsing the header of the input BCF, as you can see from the error log below:

=============================================
vcfanno version 0.3.3 [built with go1.16.5]

see: https://github.com/brentp/vcfanno
=============================================
vcfanno.go:116: found 30 sources from 12 files
vcfanno.go:157: falling back to non-bgzip
vcfanno.go:164: error parsing VCF query file molisani_cohort.PASS.snpEff.bcf: FILTER error: ##FILTER=<ID=PASS,Description="All filters passed",IDX=0>. [line: 2]
INFO error: ##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency estimate for each alternate allele",IDX=1>, []. [line: 7]
INFO error: ##INFO=<ID=AQ,Number=A,Type=Integer,Description="Allele Quality score reflecting evidence for each alternate allele (Phred scale)",IDX=2>, []. [line: 8]
INFO error: ##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes",IDX=3>, []. [line: 9]
INFO error: ##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes",IDX=4>, []. [line: 10]
FILTER error: ##FILTER=<ID=MONOALLELIC,Description="Site represents one ALT allele in a region with multiple variants that could not be unified into non-overlapping multi-allelic sites",IDX=5>. [line: 11]
FORMAT error: ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype",IDX=6>. [line: 12]
FORMAT error: ##FORMAT=<ID=RNC,Number=2,Type=Character,Description="Reason for No Call in GT: . = n/a, M = Missing data, P = Partial data, I = gVCF input site is non-called, D = insufficient Depth of coverage, - = unrepresentable overlapping deletion, L = Lost/unrepresentable allele (other than deletion), U = multiple Unphased variants present, O = multiple Overlapping variants present, 1 = site is Monoallelic, no assertion about presence of REF or ALT allele",IDX=7>. [line: 13]
FORMAT error: ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)",IDX=8>. [line: 14]
FORMAT error: ##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed",IDX=9>. [line: 15]
FORMAT error: ##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality",IDX=10>. [line: 16]
FORMAT error: ##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Phred-scaled genotype Likelihoods",IDX=11>. [line: 17]
INFO error: ##INFO=<ID=MULTIALLELIC_INDEL,Number=0,Type=Flag,Description="Variant is part of a multi-allelic variant including at least one indel",IDX=12>, []. [line: 2598]
INFO error: ##INFO=<ID=MULTIALLELIC_SNV,Number=0,Type=Flag,Description="Variant is part of a multi-allelic variant including only SNVs",IDX=13>, []. [line: 2599]
FORMAT error: ##FORMAT=<ID=AB,Number=A,Type=Float,Description="Allele balance for the ALT allele",IDX=14>. [line: 2602]
INFO error: ##INFO=<ID=F_MISSING,Number=.,Type=Float,Description="Added by +fill-tags expression F_MISSING=F_MISSING",IDX=15>, []. [line: 2603]
INFO error: ##INFO=<ID=median_GQ,Number=.,Type=Float,Description="Added by +fill-tags expression median_GQ=MEDIAN(GQ)",IDX=16>, []. [line: 2604]
INFO error: ##INFO=<ID=median_DP,Number=.,Type=Float,Description="Added by +fill-tags expression median_DP=MEDIAN(FMT/DP)",IDX=17>, []. [line: 2605]
INFO error: ##INFO=<ID=nhomalt,Number=.,Type=Float,Description="Added by +fill-tags expression nhomalt=COUNT(GT=\"AA\")",IDX=18>, []. [line: 2606]
INFO error: ##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of samples with data",IDX=19>, []. [line: 2607]
INFO error: ##INFO=<ID=MAF,Number=1,Type=Float,Description="Frequency of the second most common allele",IDX=20>, []. [line: 2608]
INFO error: ##INFO=<ID=HWE,Number=A,Type=Float,Description="HWE test (PMID:15789306); 1=good, 0=bad",IDX=21>, []. [line: 2609]
INFO error: ##INFO=<ID=TYPE,Number=.,Type=String,Description="Variant type",IDX=22>, []. [line: 2610]
INFO error: ##INFO=<ID=ExcHet,Number=A,Type=Float,Description="Test excess heterozygosity; 1=good, 0=bad",IDX=23>, []. [line: 2611]
FILTER error: ##FILTER=<ID=lowAQ,Description="Set if true: AQ < 20",IDX=24>. [line: 2614]
FILTER error: ##FILTER=<ID=noHQvars,Description="Set if true: N_PASS(GQ >= 20 & FMT/DP >= 10) == 0",IDX=25>. [line: 2617]
FILTER error: ##FILTER=<ID=highMissing,Description="Set if true: F_MISSING > 0.9",IDX=26>. [line: 2619]
FILTER error: ##FILTER=<ID=lowGQ,Description="Set if true: median_GQ < 10",IDX=27>. [line: 2621]
FILTER error: ##FILTER=<ID=SNVlowDP,Description="Set if true: TYPE == \"snp\" && median_DP < 6",IDX=28>. [line: 2623]
FILTER error: ##FILTER=<ID=INDELlowDP,Description="Set if true: TYPE == \"indel\" && median_DP < 10",IDX=29>. [line: 2625]
FILTER error: ##FILTER=<ID=NoAltCalls,Description="Set if true: AC == 0",IDX=30>. [line: 2627]
INFO error: ##INFO=<ID=BCSQ,Number=.,Type=String,Description="Local consequence annotation from BCFtools/csq, see http://samtools.github.io/bcftools/howtos/csq-calling.html for details. Format: Consequence|gene|transcript|biotype|strand|amino_acid_change|dna_change",IDX=31>, []. [line: 2633]
FORMAT error: ##FORMAT=<ID=BCSQ,Number=.,Type=Integer,Description="Bitmask of indexes to INFO/BCSQ, with interleaved first/second haplotype. Use \"bcftools query -f'[%CHROM\t%POS\t%SAMPLE\t%TBCSQ\n]'\" to translate.",IDX=31>. [line: 2634]

I've attached the actual header of the file if this can help you understand the issue.

Thanks a lot!

header.txt

brentp (Owner) commented Apr 12, 2023

Hi @edg1983,
vcfanno doesn't support BCF. I should add a better error message for that, but converting the file to VCF will resolve your issue for now.
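
As a rough illustration of the kind of pre-check that would avoid the confusing header errors above, the sketch below looks at the first bytes of a query file and reports whether it looks like BCF or VCF. This is not part of vcfanno; the function name `sniff_variant_format` is hypothetical, and the check assumes that BCF2 data begins with the bytes `BCF` (possibly inside a gzip/BGZF wrapper) and that VCF begins with `##fileformat=VCF`.

```python
import gzip

# Assumed magic prefixes: BCF2 streams start with b"BCF" followed by version bytes,
# VCF text starts with the mandatory "##fileformat=VCF..." line,
# and gzip/BGZF-compressed files start with the two-byte gzip header.
BCF_MAGIC = b"BCF"
VCF_MAGIC = b"##fileformat=VCF"
GZIP_MAGIC = b"\x1f\x8b"


def sniff_variant_format(path):
    """Return 'bcf', 'vcf', or 'unknown' based on the first bytes of the file.

    Hypothetical helper for illustration only; it is not a vcfanno function.
    """
    with open(path, "rb") as fh:
        head = fh.read(2)
    # Transparently decompress gzip/BGZF input before inspecting the payload.
    opener = gzip.open if head == GZIP_MAGIC else open
    with opener(path, "rb") as fh:
        start = fh.read(len(VCF_MAGIC))
    if start.startswith(BCF_MAGIC):
        return "bcf"
    if start.startswith(VCF_MAGIC):
        return "vcf"
    return "unknown"
```

A wrapper script could call this on the query file and stop with a clear "BCF is not supported, convert to VCF first" message whenever it returns `'bcf'`.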

edg1983 (Author) commented Apr 12, 2023

Oh, I didn't realize that!
I got confused by the error message.

Easy fix then! ;)

Thanks!
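
For anyone landing here with the same error, the fix is to convert the BCF to compressed VCF before running vcfanno. The following is a minimal sketch, assuming `bcftools` is installed and on `PATH`; the helper names and the file names in the usage note are placeholders, not taken from this thread.

```python
import subprocess


def bcf_to_vcf_command(bcf_path, vcf_gz_path):
    """Build the bcftools command that rewrites a BCF file as bgzipped VCF.

    Hypothetical helper; `-O z` asks bcftools for compressed VCF output
    and `-o` names the output file.
    """
    return ["bcftools", "view", "-O", "z", "-o", vcf_gz_path, bcf_path]


def convert_bcf_to_vcf(bcf_path, vcf_gz_path):
    """Run the conversion, raising CalledProcessError if bcftools fails."""
    subprocess.run(bcf_to_vcf_command(bcf_path, vcf_gz_path), check=True)
```

For example, `convert_bcf_to_vcf("input.bcf", "input.vcf.gz")` would produce a file that can then be passed to vcfanno as the query in place of the original BCF.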
