VPhaser output interpretation - strand bias p-value #789

cromozome · 2018-02-09T21:17:34Z

Hello

I have three questions about the VPhaser output.

How do I interpret the output below?
Pos Var Cons Strd_bias_pval Type Var_perc SNP_or_LP_Profile
1425 A T 0.3718 snp 0.3173 A:3:3 C:3:1 G:1:1 T:1319:560
1426 A G 0.756 snp 0.2629 A:2:3 G:1332:561 T:1:3

Specifically, how is it that I get such different p-values for such similar allele count profiles?
In my limited understanding, I interpret a strand bias p-value of 0.7 as stronger evidence of strand bias and 0.37 as weaker evidence, but that seems difficult to explain with such close allele counts. (I did try a 2x2 Fisher's exact test with the above values but got different p-values; I wonder if that is because VPhaser applies BH correction.)

What does var_perc in the output mean?
I am assuming file.nofdr.var.txt is the file that has both the strand bias and FDR corrections. Please correct me if I am wrong.

tomkinsc · 2018-02-09T22:20:55Z

The documentation for V-Phaser II can be found here:
http://software.broadinstitute.org/viral/docs/VPhaserII.pdf
The docs describe the output files as follows:

{ReferenceName}.fdr.var.txt – the result of interest, where strand bias test + FDR (false discovery rate) correction were used. {ReferenceName} is the name of the reference in the input BAM file. Each variant entry consists of the following:
- the reference position (coordinate starts at 1)
- predicted variant
- consensus base
- strand-bias p-value
- type of variant (SNP or LP)
- frequency of the variant and the profile, where each entry consists of three values: the base, its count in the forward strand, and its count in the reverse strand, separated by colons.
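
As an illustration of the field layout listed above, here is a minimal Python sketch that parses one whitespace-separated SNP line into named fields. The `Variant` class and the function names are hypothetical and not part of V-Phaser II; LP entries may use profile keys that this sketch does not handle. It also recomputes the variant frequency from the profile, which for the two example rows in this thread matches the Var_perc column when expressed as a percentage of all reads at the position.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    pos: int                 # reference position (1-based)
    var: str                 # predicted variant
    cons: str                # consensus base
    strand_bias_pval: float  # strand-bias p-value
    kind: str                # "snp" or "lp"
    var_perc: float          # variant frequency as reported in the file
    profile: dict            # allele -> (forward_count, reverse_count)


def parse_variant_line(line):
    """Parse one line of the form: Pos Var Cons Pval Type Var_perc Profile..."""
    fields = line.split()
    pos, var, cons, pval, kind, perc = fields[:6]
    profile = {}
    for entry in fields[6:]:
        allele, fwd, rev = entry.split(":")
        profile[allele] = (int(fwd), int(rev))
    return Variant(int(pos), var, cons, float(pval), kind, float(perc), profile)


def variant_percent(v):
    """Variant read count as a percentage of all reads in the profile."""
    total = sum(fwd + rev for fwd, rev in v.profile.values())
    fwd, rev = v.profile[v.var]
    return 100.0 * (fwd + rev) / total


v = parse_variant_line("1425 A T 0.3718 snp 0.3173 A:3:3 C:3:1 G:1:1 T:1319:560")
print(v.profile["T"])                # (1319, 560)
print(round(variant_percent(v), 4))  # 0.3173, i.e. 6 of 1891 reads
```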

{ReferenceName}.var.raw.txt – the raw variants without strand bias test.

{ReferenceName}.nofdr.var.txt – strand-bias test but no FDR correction
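
To illustrate what separates the .fdr and .nofdr files conceptually, here is a minimal sketch of a Benjamini-Hochberg adjustment, assuming that is the FDR procedure applied. The p-values below are made up for illustration, and the function name is hypothetical; how V-Phaser II applies the correction internally, and with what threshold, is not shown here.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values, not taken from any V-Phaser II run.
raw = [0.01, 0.04, 0.03, 0.5]
print(benjamini_hochberg(raw))
```

Adjusted values are never smaller than the raw ones, so a corrected and an uncorrected p-value for the same site can legitimately differ.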

V-Phaser II was developed before my time at the Broad, so for methodological questions I'll point you to the original paper (DOI: 10.1186/1471-2164-14-674), but perhaps @dpark01 can chime in. It does apply Benjamini-Hochberg correction, so perhaps that could explain the difference you're seeing in p-values.
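
For comparison with a manual calculation, here is a minimal standard-library sketch of a two-sided 2x2 Fisher's exact test on forward/reverse counts. The choice of table (variant allele versus consensus allele, forward versus reverse) is an assumption made for illustration and is not confirmed to be the test V-Phaser II runs. In general, a p-value from such a test is the probability of a strand imbalance at least this extreme when no bias exists, so smaller values indicate stronger evidence of strand bias.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-7))


# Row 1425 from this thread: variant A:3:3, consensus T:1319:560.
p = fisher_exact_two_sided(3, 3, 1319, 560)
print(f"two-sided p = {p:.4f}")
```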

cromozome · 2018-02-12T18:41:34Z

Thanks a lot Chris!
Just so I am interpreting the results correctly, in the lines below
1425 A T 0.3718 snp 0.3173 A:3:3 C:3:1 G:1:1 T:1319:560
1426 A G 0.756 snp 0.2629 A:2:3 G:1332:561 T:1:3

Am I correct in interpreting the above as a ~37% chance of strand bias in the first row and a ~75% chance of strand bias in the second row?

cromozome · 2018-02-12T19:17:07Z

Also @tomkinsc, would you recommend that I use intrahost.py in the viral-ngs package instead? (It seems to provide the same functionality.)
