You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
we have found a bug in the reference/allele count calculation that occurs when a homozygous and a heterozygous SNP are adjacent to each other. Here the screenshot:
This is the VCF line obtained from FreeBayes:
chr2 215632255 . CA TG,CG 3038.7 . AB=0.550459,0.422018;ABP=5.42083,8.76769;AC=1,1;AF=0.5,0.5;AN=2;AO=60,46;CIGAR=2X,1M1X;DP=109;DPB=109;DPRA=0,0;EPP=27.4756,21.8927;EPPR=3.0103;GTI=0;LEN=2,1;MEANALT=3,3;MQM=60,60;MQMR=60;NS=1;NUMALT=2;ODDS=261.153;PAIRED=1,1;PAIREDR=1;PAO=0,0;PQA=0,0;PQR=0;PRO=0;QA=2270,1724;QR=75;RO=2;RPL=26,19;RPP=5.32654,6.03148;RPPR=7.35324;RPR=34,27;RUN=1,1;SAF=17,14;SAP=27.4756,18.305;SAR=43,32;SRF=1;SRP=3.0103;SRR=1;TYPE=mnp,snp;technology.ILLUMINA=1,1 GT:DP:RO:QR:AO:QA:GL 1/2:109:2:75:60,46:2270,1724:-320.576,-135.081,-123.163,-179.972,0,-172.268
Note that the RO/AO annotations (bold) are correct.
For the annotation with information from different databases (1000g, ExAC, etc.), we now need to decompose the two SNPs. We tried that with different orders of vcfbreakmulti and vcfallelicprimitives (with option -kg).
The output of first vcfallelicprimitives then vcfbreakmulti is:
chr2 215632255 . C T 3038.7 . AB=0.550459;ABP=5.42083;AC=1;AF=0.5;AN=2;AO=60;CIGAR=2X;DP=109;DPB=109;DPRA=0;EPP=27.4756;EPPR=3.0103;GTI=0;LEN=1;MEANALT=3;MQM=60;MQMR=60;NS=1;NUMALT=2;ODDS=261.153;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=2270;QR=75;RO=2;RPL=26;RPP=5.32654;RPPR=7.35324;RPR=34;RUN=1;SAF=17;SAP=27.4756;SAR=43;SRF=1;SRP=3.0103;SRR=1;TYPE=snp;technology.ILLUMINA=1 GT:DP:RO:QR:AO:QA:GL 1|0:109:2:75:60,46:2270,1724:-320.576,-135.081,-123.163,-179.972,0,-172.268
chr2 215632256 . A G 3038.7 . AB=0.422018;ABP=8.76769;AC=2;AF=1;AN=2;AO=46;CIGAR=1M1X;DP=109;DPB=109;DPRA=0;EPP=21.8927;EPPR=3.0103;GTI=0;LEN=1;MEANALT=3;MQM=60;MQMR=60;NS=1;NUMALT=2;ODDS=261.153;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=1724;QR=75;RO=2;RPL=19;RPP=6.03148;RPPR=7.35324;RPR=27;RUN=1;SAF=14;SAP=18.305;SAR=32;SRF=1;SRP=3.0103;SRR=1;TYPE=snp;technology.ILLUMINA=1 GT:DP:RO:QR:AO:QA:GL 1|1:109:2:75:60,46:2270,1724:-320.576,-135.081,-123.163,-179.972,0,-172.268
Here, the variants are correctly decomposed in a herozygous variant and a homozygous variant, but the RO/AO annotations are wrong.
The output of first vcfbreakmulti then vcfallelicprimitives is:
chr2 215632255 . C T 3038.7 . AB=0.550459;ABP=5.42083;AC=1;AF=0.5;AN=2;AO=60;CIGAR=2X;DP=109;DPB=109;DPRA=0;EPP=27.4756;EPPR=3.0103;GTI=0;LEN=1;MEANALT=3;MQM=60;MQMR=60;NS=1;NUMALT=2;ODDS=261.153;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=2270;QR=75;RO=2;RPL=26;RPP=5.32654;RPPR=7.35324;RPR=34;RUN=1;SAF=17;SAP=27.4756;SAR=43;SRF=1;SRP=3.0103;SRR=1;TYPE=snp;technology.ILLUMINA=1 GT:DP:RO:QR:AO:QA:GL **.|**1:109:2:75:60:2270:-320.576,-135.081,-123.163
chr2 215632256 . A G 3038.7 . AB=0.550459;ABP=5.42083;AC=1;AF=0.5;AN=2;AO=60;CIGAR=2X;DP=109;DPB=109;DPRA=0;EPP=27.4756;EPPR=3.0103;GTI=0;LEN=1;MEANALT=3;MQM=60;MQMR=60;NS=1;NUMALT=2;ODDS=261.153;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=2270;QR=75;RO=2;RPL=26;RPP=5.32654;RPPR=7.35324;RPR=34;RUN=1;SAF=17;SAP=27.4756;SAR=43;SRF=1;SRP=3.0103;SRR=1;TYPE=snp;technology.ILLUMINA=1 GT:DP:RO:QR:AO:QA:GL .|1:109:2:75:60:2270:-320.576,-135.081,-123.163
chr2 215632256 . A G 3038.7 . AB=0.422018;ABP=8.76769;AC=1;AF=0.5;AN=2;AO=46;CIGAR=1M1X;DP=109;DPB=109;DPRA=0;EPP=21.8927;EPPR=3.0103;GTI=0;LEN=1;MEANALT=3;MQM=60;MQMR=60;NS=1;NUMALT=2;ODDS=261.153;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=1724;QR=75;RO=2;RPL=19;RPP=6.03148;RPPR=7.35324;RPR=27;RUN=1;SAF=14;SAP=18.305;SAR=32;SRF=1;SRP=3.0103;SRR=1;TYPE=snp;technology.ILLUMINA=1 GT:DP:RO:QR:AO:QA:GL .|1:109:2:75:46:1724:-320.576,-179.972,-172.268
Here, the homzygous variant is split into two heterozygous variants and the RO anntation is wrong.
Is there a way to generate correct counts (we need it to compute the real AF because the given AF is always 0.5 or 1.0).
Hi,
we have found a bug in the reference/allele count calculation that occurs when a homozygous and a heterozygous SNP are adjacent to each other. Here the screenshot:
This is the VCF line obtained from FreeBayes:
Note that the RO/AO annotations (bold) are correct.
For the annotation with information from different databases (1000g, ExAC, etc.), we now need to decompose the two SNPs. We tried that with different orders of
vcfbreakmulti
andvcfallelicprimitives
(with option -kg).The output of first vcfallelicprimitives then vcfbreakmulti is:
Here, the variants are correctly decomposed in a herozygous variant and a homozygous variant, but the RO/AO annotations are wrong.
The output of first vcfbreakmulti then vcfallelicprimitives is:
Here, the homzygous variant is split into two heterozygous variants and the RO anntation is wrong.
Is there a way to generate correct counts (we need it to compute the real AF because the given AF is always 0.5 or 1.0).
A working example including the BAM file and a Makefile can be found here:
vcflib_bug_allele_counts.zip
Best,
Marc
The text was updated successfully, but these errors were encountered: