Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistency between ivar consensus and ivar variants #103

Open
IsabelFE opened this issue Jun 14, 2021 · 5 comments
Open

Inconsistency between ivar consensus and ivar variants #103

IsabelFE opened this issue Jun 14, 2021 · 5 comments

Comments

@IsabelFE
Copy link

IsabelFE commented Jun 14, 2021

I already mentioned this in #85 (comment), but I am opening its own issue because I think this is a separate problem.

There is an inconsistency between ivar consensus and ivar variants results when the same settings are used:

samtools mpileup -d 5000 -q 30 -f wuhan_trim3polyA.fasta "sample.sorted.bam" > "sample.pile"

< "sample.pile" | ivar consensus -t 0.6 -m 10 -p "sample.consensus.fa"
< "sample.pile" | ivar variants -t 0.6 -m 10 -r wuhan_trim3polyA.fasta -g GCF_009858895.2_ASM985889v3_genomic.gff -p "sample.variant.tsv"

samtools mpileup -d 5000 -q 30 -B -f wuhan_trim3polyA.fasta "sample.sorted.bam" > "sample.B_pile"

< "sample.B_pile" | ivar consensus -t 0.6 -m 10 -p "sample.B_consensus.fa"
< "sample.B_pile" | ivar variants -t 0.6 -m 10 -r wuhan_trim3polyA.fasta -g GCF_009858895.2_ASM985889v3_genomic.gff -p "sample.B_variant.tsv"

The problem occurs around indels, for example based on the bam files on IGV it looks like there is and 4nts indel here:
Screen Shot 2021-06-14 at 11 31 56 AM.

When looking at the fasta files, both the sample.B_consensus.fa and sample.consensus.fa (with and without the flag -B) files have a 4nts indel. But when looking at the corresponding variant.tsv files, only the one with the -B flag identifies the indel.

@IsabelFE IsabelFE reopened this Jun 14, 2021
@charlesfoster
Copy link

Just chiming in, did you see this in the user guide?

"Note: Please use the -B options with samtools mpileup to call variants and generate consensus. When a reference sequence is supplied, the quality of the reference base is reduced to 0 (ASCII: !) in the mpileup output. Disabling BAQ with -B seems to fix this. This was tested in samtools 1.7 and 1.8."

Perhaps this is related to the discrepancy you see here. Not sure why it would be inconsistent between variants and consensus though.

@PovilasMat
Copy link

We run into same issue of disparities between variants and consensus. Even with -B the same issue persists in some situations.

@IsabelFE
Copy link
Author

@charlesfoster, yeah I've tried with and without the -B. The problem is the inconsistency between ivar consensus and ivar variants. In my experience looking at bam files, I think that the ivar variants calls are more accurate than what gets generated on the ivar consensus fasta file. @gkarthik, would it be possible to input the ivar variants output into ivar consensus?
As many other people, we are using ivar consensus to generate SARS-CoV-2 genome assemblies, and I think that this could improve the quality of GISAID genome submissions and pangolin lineage assignment.

Thanks

@charlesfoster
Copy link

charlesfoster commented Jun 29, 2021

@IsabelFE what I've ended up doing in some circumstances now is taking the output of ivar variants and incorporating them into a consensus genome using bcftools. To do so, I first convert the ivar output tsv into a vcf file, then bgzip and tabix that file. Finally, I extract regions of low depth using bedtools, ensure they don't overlap with inferred deletions by using a combination of bcftools query and bedtools subtract, then use bcftools consensus with a depth mask and 'include expression' to incorporate the variants I'm confident in based on depth, frequency etc. I've outlined the working steps here: samtools/bcftools#1386 (comment) (just substitute the bcftools vcf file for the ivar vcf, and adjust the vcf tags accordingly, e.g. instead of FORMAT/VAF you use FORMAT/ALT_FREQ)

@pmenzel
Copy link

pmenzel commented Nov 5, 2021

This might be related to #97

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

4 participants