Skip to content

kveeramah/Dstat_vcf

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 

Repository files navigation

Dstat_vcf

This python script takes a vcf file and performs D_statistic analysis for a bunch of samples direct from a vcf

Only biallelic SNP loci are considered. When encountering a heteroygote, a random allele is chosen (i.e. pseudo haploid).

numpy and pyvcf must be installed in the version of python used.

vcf files must be bgzipped and tabix indexed. Reference genome must also be indexed The pop_test files must have the space seperated fields in the following order: W, X, Y, Z (see qpdstat manual in Admixtools)

Default block size for delete m_i jacknife if 5MB and qual score threshold for including a site is 30. You can change these variables in the python script

Usage is ./Dstat.vcf.py <in.vcf.gz> <pop_test_file> <reference_genome> <out_stem> <nb_threads>

Written (poorly) by Krishna Veeramah (krishna.veeramah@stonybrook.edu)

About

Python script to calculate D-statistics directly from VCF

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages