Releases: szpiech/selscan

Unphased statistics

22 Oct 20:51

Introducing unphased versions of iHS, nSL, XP-EHH, and XP-nSL. Use with the --unphased flag. See ZA Szpiech (2021) for details. Normalize as you would with the phased statistics.

ZA Szpiech (2021) selscan 2.0: scanning for sweeps in unphased data. bioRxiv doi: 10.1101/2021.10.22.465497.

v1.3.0

22 May 20:48

20MAY2020 - selscan v1.3.0 - Log ratios are now output as log10 not natural logs (beware comparisons with raw selscan computations from versions prior to v1.3.0). New statistics implemented.
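
To compare raw scores from versions prior to v1.3.0 with v1.3.0 output, the older natural-log ratios can be rescaled to log10. A minimal sketch, assuming the old unstandardized scores are already loaded as floats; the function name is illustrative and not part of selscan:

```python
import math

def ln_score_to_log10(score_ln):
    """Rescale a natural-log ratio (pre-v1.3.0 output) to a log10 ratio."""
    # log10(x) = ln(x) / ln(10)
    return score_ln / math.log(10)
```

The conversion is a constant factor, so scores that have been standardized (mean subtracted, divided by standard deviation) should not change; only raw, unnormalized values differ between versions.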

--pmap <bool>: Set this flag to use physical distance instead of genetic map

Introduction of XP-nSL, a cross-population statistic for identifying hard/soft sweeps that does not require a genetic map. XP-nSL:nSL::XP-EHH:iHS. Cite ZA Szpiech et al. (2020) High-altitude adaptation in rhesus macaques. bioRxiv doi: https://doi.org/10.1101/2020.05.19.104380

--xpnsl <bool>: Set this flag to calculate XP-nSL.
Default: false

Normalize XP-nSL with --xpnsl flag in norm.

lasugden adds the option to calculate XP-EHH with either definition of EHH. By default, uses original denominator (N choose 2). To use denominator defined in Wagh et al. for better performance on incomplete sweeps, use flag --wagh

--wagh <bool>: Set this flag to calculate EHH with Wagh denominator. For xpehh only. DO NOT use with --alt
Default: false

Normalize these computations with --xpehh flag in norm.
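
For orientation, the original EHH definition (denominator N choose 2) can be sketched as follows. This is an illustration only, not selscan's implementation: haplotypes are assumed to be hashable strings covering the region from the core to the current marker, and the function name is hypothetical. The Wagh et al. denominator is not shown here.

```python
from collections import Counter
from math import comb

def ehh(haplotypes):
    """EHH with the original denominator: sum over distinct haplotypes
    of C(count, 2), divided by C(N, 2)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    # Number of identical pairs, over the number of all possible pairs.
    return sum(comb(c, 2) for c in counts.values()) / comb(n, 2)
```

With four haplotypes of which two are identical, `ehh(["AB", "AB", "AC", "AD"])` gives 1/6: one identical pair out of six possible pairs.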

norm v1.3.0 - Now supports --xpnsl flag, which is identical to using --xpehh.
--qbins now has a default value of 10 instead of 20.
--bp-win analyses have been changed when analyzing XP-EHH and XP-nSL scores. Since positive scores suggest adaptation in the first (non-ref) population and negative scores suggest adaptation in the second (ref) population, we split windows into those enriched for extreme positive scores and those enriched for extreme negative scores.
min and max scores are given for each window for XP statistics, and the max |score| is reported for iHS and nSL stats.

*.windows output files therefore have additional columns:

For XP stats:
<# scores in win>

For iHS and nSL:
<# scores in win>
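
The per-window summaries described above can be pictured with a short sketch. The dictionary keys, function names, and the critical value of 2.0 are assumptions made for illustration; they are not the column names or defaults of norm.

```python
def summarize_xp_window(scores, crit=2.0):
    """XP-EHH / XP-nSL: track positive and negative extremes separately.
    Assumes the window contains at least one score."""
    n = len(scores)
    return {
        "n_scores": n,
        "frac_gt": sum(s > crit for s in scores) / n,   # first (non-ref) population
        "frac_lt": sum(s < -crit for s in scores) / n,  # second (ref) population
        "max": max(scores),
        "min": min(scores),
    }

def summarize_ihs_window(scores, crit=2.0):
    """iHS / nSL: sign is not split, so report the max |score|.
    Assumes the window contains at least one score."""
    n = len(scores)
    return {
        "n_scores": n,
        "frac_extreme": sum(abs(s) > crit for s in scores) / n,
        "max_abs": max(abs(s) for s in scores),
    }
```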

selscan v1.2.0 norm v1.2.1

25 Aug 18:17

25AUG2017 - norm v1.2.1 released to fix a crash when --nsl flag is used. This release contains the same selscan v1.2.0 as released in July.

1.2.0

18 Jul 20:57

Support for iHH12 calculations. norm has --ihh12 and --nsl flags. Fixed buggy --crit-percent flag in norm binary. Fixed misleading error messages when --trunc-ok used.

--skip-low-freq on by default

12 Feb 18:08

--skip-low-freq is now on by default. The flag no longer functions, but for the time being won't throw an error if used (just a warning). If you wish to include low frequency variants in the construction of your haplotypes use --keep-low-freq.

Updates to norm so that it can handle output from selscan when --ihs-detail is used.

28 Oct 18:11

28OCT2015 - Updates to norm so that it can handle output from selscan when --ihs-detail is used.

Calculate nSL plus option for more detailed output for iHS

15 Jun 20:00

15JUNE2015 - Release of 1.1.0. tomkinsc adds the --ihs-detail parameter which, when provided as an adjunct to --ihs, will cause selscan to write out four additional columns to the output file of iHS calculations (in order): derived_ihh_left, derived_ihh_right, ancestral_ihh_left, and ancestral_ihh_right.

An example file row follows, with header added for clarity.

locus phys-pos 1_freq ihh_1 ihh_0 ihs derived_ihh_left derived_ihh_right ancestral_ihh_left ancestral_ihh_right

16133705 16133705 0.873626 0.0961264 0.105545 -0.0934761 0.0505176 0.0456087 0.0539295 0.0516158

From these values we can calculate iHS, but it is preserved in the output for convenience. Having left and right integral information may assist certain machine learning models that gain information from iHH asymmetry.
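
As a check on the example row, the sketch below recomputes the unstandardized iHS from the four detail columns. It assumes that each iHH is the sum of its left and right integrals and that the score is the natural log of ihh_1 / ihh_0 (this release predates the switch to log10 in v1.3.0); the function name is illustrative.

```python
import math

def ihs_from_detail(derived_left, derived_right, ancestral_left, ancestral_right):
    """Recompute unstandardized iHS from the --ihs-detail columns."""
    ihh_1 = derived_left + derived_right      # matches the ihh_1 column
    ihh_0 = ancestral_left + ancestral_right  # matches the ihh_0 column
    return math.log(ihh_1 / ihh_0)

# Values taken from the example row above.
ihs = ihs_from_detail(0.0505176, 0.0456087, 0.0539295, 0.0516158)
```

The result agrees with the ihs column of the example row (-0.0934761) up to the rounding of the printed values.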

selscan can now calculate the nSL statistic described in A Ferrer-Admetlla, et al. (2014) MBE, 31: 1275-1291. Also introduced a check on map distance ordering. Three new command line options.

--nsl : Set this flag to calculate nSL.
Default: false

--max-extend-nsl : The maximum distance an nSL haplotype is allowed to extend from the core.
Set <= 0 for no restriction.
Default: 100

--ihs-detail : Set this flag to write out left and right iHH scores for '1' and '0' in addition to iHS.
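
The check on map distance ordering mentioned above can be pictured as a scan for positions that decrease. This sketch is not selscan's code: it assumes positions have already been parsed into a list of floats, and it treats equal consecutive positions as acceptable, which may differ from what selscan enforces.

```python
def first_out_of_order(map_positions):
    """Return the index of the first position smaller than its
    predecessor, or None if the list is non-decreasing."""
    for i in range(1, len(map_positions)):
        if map_positions[i] < map_positions[i - 1]:
            return i
    return None
```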

VCF support

06 May 20:11

13MAY2015 - norm v1.0.5 is released. norm will now normalize ihs or xpehh scores. Two new command line options.

--ihs : Do iHS normalization.

--xpehh : Do XP-EHH normalization.

Exactly one of these must be specified when running norm (e.g. ./norm --ihs --files *.ihs.out or ./norm --xpehh --files *.xpehh.out).

06MAY2015 - Added basic VCF support. selscan can now read .vcf and .vcf.gz files but without tabix support. A mapfile is required when using VCF. Two new command line options.

--vcf : A VCF file containing haplotype data.
A map file must be specified with --map.

--vcf-ref : A VCF file containing haplotype and map data.
Variants should be coded 0/1. This is the 'reference'
population for XP-EHH calculations and should contain the same number
of loci as the query population. Ignored otherwise.

Mean pairwise sequence difference

17 Oct 21:28

17OCT2014 - Release of 1.0.4. A pairwise sequence difference module has been introduced. This module isn't multithreaded at the moment, but still runs quite fast. Calculating pi in 100bp windows with 198 haplotypes with 707,980 variants on human chr22 finishes in 77s on the test machine. Using 100kb windows, it finishes in 34s. Two new command line options.

--pi : Set this flag to calculate mean pairwise sequence difference in a sliding window.
Default: false

--pi-win : Sliding window size in bp for calculating pi.
Default: 100
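
A minimal sketch of mean pairwise sequence difference per window, for illustration only. It assumes haplotypes are lists of 0/1 alleles, positions are 1-based bp coordinates, and windows are non-overlapping blocks of `win` bp starting at position 1; selscan's actual windowing and output format may differ, and the function name is hypothetical.

```python
from collections import defaultdict

def windowed_pi(haplotypes, positions, win=100):
    """Return {window_start: mean pairwise differences} for windows
    that contain at least one variant."""
    n = len(haplotypes)
    n_pairs = n * (n - 1) // 2
    diffs = defaultdict(int)
    for site, pos in enumerate(positions):
        ones = sum(h[site] for h in haplotypes)
        # A site with `ones` copies of allele 1 differs in ones * (n - ones) pairs.
        diffs[(pos - 1) // win] += ones * (n - ones)
    return {w * win + 1: d / n_pairs for w, d in sorted(diffs.items())}
```

Counting alleles per site avoids comparing every pair of haplotypes directly, which keeps the cost linear in the number of haplotypes.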

XPEHH fixed, gzip support

15 Sep 19:08

Release of 1.0.3. A critical bug in the XP-EHH module was introduced in version 1.0.2 and has been fixed in 1.0.3. Do not use 1.0.2 for calculating XP-EHH scores. Thanks to David McWilliams for finding this error. 1.0.3 also introduces support for gzipped input files. You may pass hap.gz, map.gz, and tped.gz files interchangeably with unzipped files using the same command line arguments. A new command line option is available.

--trunc-ok : If an EHH decay reaches the end of a sequence before reaching the cutoff,
integrate the curve anyway (iHS and XPEHH only).
Normal function is to disregard the score for that core.
Default: false
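
The effect of --trunc-ok can be sketched with a trapezoid integration of an EHH decay curve in one direction from the core. This is an illustration under stated assumptions, not selscan's code: the cutoff value of 0.05, the function name, and the exact stopping rule are assumptions.

```python
def integrate_ehh(distances, ehh_values, cutoff=0.05, trunc_ok=False):
    """Integrate EHH from the core outward until it falls to the cutoff.
    Returns None if the sequence ends first and trunc_ok is False."""
    area = 0.0
    for i in range(1, len(ehh_values)):
        # Trapezoid between consecutive markers.
        width = distances[i] - distances[i - 1]
        area += 0.5 * (ehh_values[i - 1] + ehh_values[i]) * width
        if ehh_values[i] <= cutoff:
            return area
    # Reached the end of the sequence before EHH decayed to the cutoff.
    return area if trunc_ok else None
```

For a curve such as `[1.0, 0.5, 0.2]` that never reaches the cutoff, the default call returns None (the core is disregarded), while `trunc_ok=True` returns the area accumulated up to the end of the sequence.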