
7. Other pipeline variables II

d-j-e edited this page Jun 7, 2016 · 11 revisions

##Other pipeline variables (cont'):

###"minimum_depth"
Minimum depth of reads for variant calling. (SAMtools)

Default value:

minimum_depth = 5

###"HetsVCF"
The pipeline filters out heterozygous SNP calls. To capture these SNPs in the form of a VCF (one per isolate), set HetsVCF to 'True'.

_e.g._
```
HetsVCF = False  # default: no VCF produced
```
Or:
```
HetsVCF = True  # VCF produced
```
###"cover_fail"
If the percentage of the reference replicon mapped is below this threshold, the isolate will be called a 'fail' and removed from downstream SNP analysis.

Default value:
```
cover_fail = 50%
```
###"depth_fail"
If the average read count for any reference replicon falls below this threshold value, the isolate will be called a 'fail' and removed from downstream SNP analysis.

Default value:
```
depth_fail = 10
```
###"mapped_fail"
If the percentage of total reads mapped to the replicon falls below this threshold value, the isolate will be called a 'fail' and removed from downstream SNP analysis.

Default value:
```
mapped_fail = 50%
```
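
The three 'fail' thresholds above can be read as one combined check. The following is a minimal sketch only, not the pipeline's own code: the function name `passes_qc` and its arguments are hypothetical, the thresholds are passed as plain numbers, and a single replicon is assumed (the text applies `depth_fail` to every reference replicon).

```python
def passes_qc(coverage_pct, mean_depth, mapped_pct,
              cover_fail=50.0, depth_fail=10.0, mapped_fail=50.0):
    """Return True if an isolate clears all three thresholds.

    Hypothetical helper for illustration; a value *below* a threshold
    counts as a fail, so a value equal to the threshold passes.
    """
    return (coverage_pct >= cover_fail
            and mean_depth >= depth_fail
            and mapped_pct >= mapped_fail)


if __name__ == "__main__":
    print(passes_qc(95.0, 30.0, 80.0))  # well above every threshold
    print(passes_qc(49.9, 30.0, 80.0))  # coverage just under cover_fail
```
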
###"sd_out"
If an isolate's SNP count is more than sd_out\*s.d. above the mean count, the isolate will be called as an outgroup (phylogeny run only).

Default value:
```
sd_out = 2
```
So for the default value, any isolate with a SNP count greater than the mean SNP count + 2\*s.d. will be called as an outgroup.
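
As a rough illustration of this rule, the sketch below flags isolates whose SNP count exceeds the mean plus `sd_out` standard deviations. The function name `find_outgroups` and the input dictionary are hypothetical, and the use of the population standard deviation is an assumption; the pipeline may compute the s.d. differently.

```python
from statistics import mean, pstdev


def find_outgroups(snp_counts, sd_out=2.0):
    """Return the sorted names of isolates above mean + sd_out * s.d.

    snp_counts maps isolate name -> SNP count (hypothetical input format).
    Assumption: population standard deviation (pstdev).
    """
    counts = list(snp_counts.values())
    cutoff = mean(counts) + sd_out * pstdev(counts)
    return sorted(name for name, n in snp_counts.items() if n > cutoff)


if __name__ == "__main__":
    example = {"a": 10, "b": 12, "c": 11, "d": 9, "e": 10, "f": 500}
    print(find_outgroups(example))
```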

###"strand_bias_cutoff"
Checks that there are reads in both directions to support each SNP call.
_i.e._ if `ABS(DP4[2]-DP4[3]) / (DP4[2]+DP4[3]) < strand_bias_cutoff`, the SNP is included.
Set to >= 1 to turn strand bias filtering off.

_e.g._ (default)
```
strand_bias_cutoff = 0.8
```
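
The strand bias test can be sketched as a small function. This is an illustration under stated assumptions, not the pipeline's implementation: the name `passes_strand_bias` is hypothetical, `dp4` is taken to be the four-value DP4 field from the VCF with zero-based indexing (so `dp4[2]` and `dp4[3]` are the forward and reverse counts for the alternate allele), and a call with no supporting reads is assumed to be rejected.

```python
def passes_strand_bias(dp4, strand_bias_cutoff=0.8):
    """Return True if the SNP call passes the strand bias filter.

    dp4: sequence of four read counts (ref-fwd, ref-rev, alt-fwd, alt-rev).
    """
    alt_fwd, alt_rev = dp4[2], dp4[3]
    total = alt_fwd + alt_rev
    if total == 0:
        # Assumption: with no alternate-allele reads nothing supports the call.
        return False
    return abs(alt_fwd - alt_rev) / total < strand_bias_cutoff


if __name__ == "__main__":
    print(passes_strand_bias((0, 0, 10, 10)))  # balanced strands
    print(passes_strand_bias((0, 0, 20, 0)))   # all reads on one strand
```

With a cutoff above 1 the ratio can never reach the cutoff, so every call with at least one supporting read is kept.
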
###"check_reads_mapped"
Use 'check_reads_mapped' to switch the check on the percentage of reads mapped on or off, and to choose which replicons are checked.

By default (_i.e._ set to the null string, "") the pipeline will use the largest replicon.
If it is set to "off", there will be no check for percentage of reads mapped. Otherwise, give a comma-separated list of the n replicons to be checked, followed by an 'x', then followed by the ratios of the first n-1 replicons (each as a fraction of the total genome); the last replicon takes the remainder. For a single replicon, just put the replicon.

_e.g._
```
check_reads_mapped = ""
```
Or:
```
check_reads_mapped = "off"
```
Or:
```
check_reads_mapped = "rep_1"
```
Or:
```
check_reads_mapped = "rep_1,rep_2,rep_3,x,0.45,0.3"
```
_i.e._ In the last example, rep_1 is 45% of the total genome, rep_2 is 30% of the total genome, and rep_3 is 25% of the total genome (by default).

Note: there must be no spaces in the list.
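
To make the list format concrete, here is a minimal parsing sketch. It is not taken from the pipeline: the function name `parse_check_reads_mapped` and its return values are hypothetical, and it assumes the last replicon receives whatever fraction remains after the listed ratios.

```python
def parse_check_reads_mapped(value):
    """Interpret a check_reads_mapped string (illustrative only).

    Returns "largest" for "", None for "off", otherwise a dict
    mapping each replicon name to its fraction of the total genome.
    """
    if value == "":
        return "largest"
    if value == "off":
        return None
    fields = value.split(",")
    if "x" not in fields:
        if len(fields) != 1:
            raise ValueError("several replicons need an 'x' and ratios")
        return {fields[0]: 1.0}
    split = fields.index("x")
    names = fields[:split]
    ratios = [float(r) for r in fields[split + 1:]]
    if len(ratios) != len(names) - 1:
        raise ValueError("expected ratios for the first n-1 replicons")
    # Assumption: the final replicon takes the remaining fraction.
    ratios.append(1.0 - sum(ratios))
    return dict(zip(names, ratios))


if __name__ == "__main__":
    print(parse_check_reads_mapped("rep_1,rep_2,rep_3,x,0.45,0.3"))
```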

###"conservation"
During allele matrix filtering, you can set the conservation level for missing alleles. This is a ratio between 1.0 (100% conservation - remove all SNPs with even one missing allele call) and 0.0 (0% conservation - remove no SNPs). By default, the pipeline produces the 95% and 0% conservation matrices, with downstream analysis on the 95% matrix.

By entering a different conservation level (_e.g._ 0.85), both the 95% and 0% matrices will still be produced, but so too will the 85% matrix (in this example), and downstream analysis carried out on this latter matrix.

Default value:
```
conservation = 0.95
```
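
The effect of the conservation level can be sketched as a filter over SNP positions. This is a hedged illustration, not the pipeline's code: the function name `filter_by_conservation`, the dictionary layout (position mapped to one allele character per isolate), the '-' symbol for a missing allele, and the `>=` comparison are all assumptions.

```python
def filter_by_conservation(matrix, conservation=0.95, missing="-"):
    """Keep SNP positions where enough isolates have an allele call.

    matrix maps SNP position -> string of alleles, one per isolate
    (hypothetical layout). A position is kept when the fraction of
    non-missing calls is at least `conservation`.
    """
    kept = {}
    for pos, alleles in matrix.items():
        called = sum(1 for a in alleles if a != missing)
        if called / len(alleles) >= conservation:
            kept[pos] = alleles
    return kept


if __name__ == "__main__":
    example = {101: "AAAA", 202: "AA-A", 303: "----"}
    print(sorted(filter_by_conservation(example, conservation=1.0)))
    print(sorted(filter_by_conservation(example, conservation=0.0)))
```
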
###"DifferenceMatrix"
The pipeline can produce a difference matrix as optional output. Currently this is a pairwise difference count. To get the difference matrix, set 'DifferenceMatrix' to 'True'.

_e.g._
```
DifferenceMatrix = False  # default: no difference matrix produced
```
Or:
```
DifferenceMatrix = True  # difference matrix output produced
```
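
As an illustration of what a pairwise difference count looks like, the sketch below counts differing allele calls between every pair of isolates. It is not the pipeline's implementation: the function name `difference_matrix`, the input layout (isolate name mapped to a string of alleles at the same SNP positions), and the choice to skip positions where either isolate has a missing '-' call are assumptions.

```python
from itertools import combinations


def difference_matrix(alleles, missing="-"):
    """Return a nested dict of pairwise SNP difference counts.

    alleles maps isolate name -> string of allele calls (hypothetical layout).
    Assumption: positions with a missing call in either isolate are ignored.
    """
    names = sorted(alleles)
    diff = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        n = sum(1 for x, y in zip(alleles[a], alleles[b])
                if x != y and missing not in (x, y))
        diff[a][b] = diff[b][a] = n
    return diff


if __name__ == "__main__":
    example = {"iso1": "ACGT", "iso2": "ACGA", "iso3": "TC-A"}
    for name, row in difference_matrix(example).items():
        print(name, row)
```
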
[Previous](https://github.com/katholt/RedDog/wiki/6.-Other-pipeline-variables-I#other-pipeline-variables) [Home](https://github.com/katholt/RedDog/wiki/1.-Instruction-Manual#reddog-v049---instruction-manual) [Next](https://github.com/katholt/RedDog/wiki/8.-Advanced-Settings#advanced-settings)