codinghedgehog/prephix
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Prephix (Pre-Phrecon Input fiXer) Usage: prephix.pl k28.out_file1 [k28.out_file2 ...] -batchid <id> [-debug] [-quiet] [-exclude <loci_file>] [-ignore_quality] Takes in a list of input files and generates one SNP loci output file (containing multiple strains), one reference base file, and a file containing the excluded indel entries. The indel file is formated as "STRAIN ID" [TAB] k28|nuc|vcf [TAB] ...excluded VAAL K28, VCF, or NUCMER line... So prephix basically dumps the excluded lines as-is into the indel file, but prefixes each line with the strain id and then either k28 for VAAL formatted line, nuc for NUCMER formatted line, or vcf for VCF formatted line. The SNP loci and reference base files are suitable as input to the phrecon (Phylo Reconstructor) utility (i.e. tab delimited columnar format: STRAIN_ID [TAB] LOCI [TAB] BASE). The SNP loci file and indel files are suitable input for the SNP Swapper utility. Recognized input file formats: k28.out (VAAL), NUCMER, VCF You may mix input file types within a single run/batch. === ADDITIONAL SCRIPTS === Additional scripts exist in this repository, mainly utility and helper scripts that work on the outfiles from prehix. --- Prephix 2 SNP Effect Converter --- Usage: prephix2snpeffect.py ref_file snp_file This script takes the ref and snp output files from a prephix run and generates an SNP Effect compliant input file. SNP Effect file format is: SNPlocus<tab>sample=snpbase<tab>ref=refbase SNP Effect doesn't care about strain ids so if two strains share a locus but have different bases, then two lines will be generated, with identical locus and ref values but different sample values. --- Prephix to PhenoLink Generator --- Usage: pre2phe.py -ref <ref base file> -snp <snp file> -out <PhenoLink output file> Takes in a one SNP loci output file (possibly containing multiple strains) and one reference base file generated from prephix, and generates a PhenoLink export file. The PhenoLink export file is a tab-delimited file, with columns of all Loci Names (in this case RefBase_Locus_SnpBase) and rows of StrainIDs with 1's indicating the presence of that Locus Name in its strain, and 0 otherwise. --- SNP Comparator --- Usage: snp_compare.py --snpfile <snpfile> --compare <compare strain 1> <compare strain 2> ... [--exclude <exclude strain 1> <exclude strain 2> ...] [--outfile <filename>] Takes in an SNP file (formatted like the snp file output from prephix) and a list of strain IDs to compare and/or exclude. SNP Comparator finds all the common SNP Loci between the compared Strains and then removes the SNP Loci that are common across the exclude Strains.
About
Prephix (Pre-Phrecon Input fiXer) -- This is a script that takes in a list of input files and generates one SNP loci output file (containing multiple strains), one reference base file, and a file containing the excluded indel entries.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published