BGT missing SNPs #12

Open
biojiangke opened this issue May 14, 2018 · 1 comment

Comments

@biojiangke

Running bcftools and bgt on the same region, bcftools showed two SNPs but bgt (the bgt was built from the same vcf.gz) returned none. I'm not sure how widespread this problem is. Could someone take a look?

An example:

bcftools view -r 05:67948215-67948219 xxx.vcf.gz | grep -v "#" | wc -l
2
bgt view -r 05:67948215-67948219 -s sample.list xxx.bgt | grep -v "#" | wc -l
0

sample.list includes all sample names from the VCF header.

@zmaroti

zmaroti commented Sep 16, 2021

I believe the difference arises because "bgt import" ONLY imports entries whose FILTER field is "PASS".

I saw that XXX.bgt.bcf lacks all the variants that have VQSRTrancheXXX in the FILTER field of the original VCF. When I count only the PASS variants, the bgt data has slightly more entries (because multiallelic variants are split into atomic ones).

So THERE SHOULD BE A HUGE WARNING in the import manual that ONLY PASS variants are imported.
Remember that the FILTER and INFO (and maybe ID?) VCF fields are cleared, so afterwards there is no way to distinguish valid from invalid variants. Importing only PASS variants therefore seems logical.
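
To check whether this explains a missing region, it may help to tally the FILTER column of the original VCF. The following is only a minimal sketch: `filter_tally` is a hypothetical helper name (not part of bgt or bcftools), and it assumes uncompressed, tab-separated VCF text on stdin.

```shell
# Hypothetical helper: tally the FILTER column (7th tab-separated field)
# of VCF text read from stdin. Header lines (starting with '#') are skipped.
filter_tally() {
  awk -F'\t' '!/^#/ { n[$7]++ } END { for (f in n) printf "%s\t%d\n", f, n[f] }' \
    | LC_ALL=C sort
}
```

For the region in the original report, this could be run as `bcftools view -r 05:67948215-67948219 xxx.vcf.gz | filter_tally`. If neither of the two records is PASS, the bgt result of 0 would be consistent with the PASS-only import described above. Adding `-f PASS` to the `bcftools view` call should give a count that is directly comparable to bgt, apart from the multiallelic splitting.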

I know that, technically, both the rs number (usually placed in ID) and the FILTER (PASS, etc.) information could be placed into the variant annotation fmf file as extra tags. However, that wouldn't save space, and at the moment the included javascript does not carry them over. Instead, it would be nice if the VCF output were more standard-conformant and kept these meaningful fields (stored in the bgt bcf and queryable, the way you can query by region, etc.).

Zoltan
