TheNoyesLab/SNPCall_Benchmarking

SNPCall_Benchmarking

Benchmarks variant callers on simulated shotgun metagenomic data. The repository implements a bioinformatic pipeline that runs from read synthesis through alignment to variant calling.

Methods & Workflow



Figure 1. Workflow diagram of the variant caller benchmarking process. First, selected RefSeq genomes were used to simulate a metagenome, and random mutations were added to the genomes to create a "gold standard" dataset. Then the number of genomes and the number of reads were varied to evaluate the variant callers under a range of sample conditions.



Results





Important Directories

projects/SNP_Call_Benchmarking/Benchmarking_Run:

  • Directory containing synthetic reads, variant caller output, and benchmarks
  • All output of Benchmarking Workflow goes here

SNPCall_Benchmarking/Workflow:

Production-stage scripts that form the core Benchmarking workflow.

Genesis.sh

  • Script performing directory setup, SNP generation, read synthesis, and alignment

MultiCaller.sh

  • Script running variant callers and benchmarking on data generated by Genesis.sh
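The benchmarking step compares each caller's output against the "gold standard" SNPs. A minimal sketch of that comparison is shown below. It is an illustration only, not the code in MultiCaller.sh: the function name `benchmark_calls`, the `(genome, position, ref, alt)` tuple layout, and the choice of precision, recall, and F1 as metrics are all assumptions.

```python
def benchmark_calls(truth, calls):
    """Compare called variants against the known (injected) variants.

    Both arguments are iterables of hashable variant keys; here we assume
    (genome, position, ref, alt) tuples, which is a hypothetical layout.
    """
    truth, calls = set(truth), set(calls)
    tp = len(truth & calls)   # injected SNPs that the caller found
    fp = len(calls - truth)   # calls that match no injected SNP
    fn = len(truth - calls)   # injected SNPs that the caller missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Made-up example data, for illustration only
truth = [("genomeA", 101, "A", "G"), ("genomeA", 250, "C", "T"),
         ("genomeB", 77, "G", "A")]
calls = [("genomeA", 101, "A", "G"), ("genomeB", 77, "G", "A"),
         ("genomeB", 300, "T", "C")]
print(benchmark_calls(truth, calls))
```

Matching on the full tuple means a call at the right position with the wrong alternate base counts as both a false positive and a false negative; a looser comparison could match on genome and position only.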

Single_scripts/

  • Core workflow broken up into individual scripts (read generation, alignment, individual variant callers, etc.)

SNP_Injector_Fasta.py

  • Python script used by Genesis.sh to "inject" SNPs into FASTA files
  • Creates a log of input SNPs and genome locations
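The idea of injecting SNPs and logging them can be sketched as follows. This is a hedged illustration, not the contents of SNP_Injector_Fasta.py: the function `inject_snps`, its parameters, the seeded random sampling, and the tab-separated log layout are all assumptions, and the sketch works on a single sequence string rather than parsing a FASTA file.

```python
import random

BASES = "ACGT"


def inject_snps(sequence, n_snps, seed=0):
    """Substitute n_snps random positions with a different base.

    Returns (mutated_sequence, log), where log is a sorted list of
    (position, ref, alt) tuples with 0-based positions.
    Note: the sequence is upper-cased, so soft-masking is not preserved.
    """
    rng = random.Random(seed)  # seeded so the "gold standard" is reproducible
    seq = list(sequence.upper())
    # Only unambiguous bases are eligible; N and other IUPAC codes are skipped
    candidates = [i for i, base in enumerate(seq) if base in BASES]
    positions = sorted(rng.sample(candidates, min(n_snps, len(candidates))))
    log = []
    for pos in positions:
        ref = seq[pos]
        alt = rng.choice([b for b in BASES if b != ref])  # always differs from ref
        seq[pos] = alt
        log.append((pos, ref, alt))
    return "".join(seq), log


# Toy sequence, for illustration only
mutated, snp_log = inject_snps("ACGTACGTNNACGT", n_snps=3, seed=42)
for pos, ref, alt in snp_log:
    # Report 1-based positions, as VCF-style tools usually do
    print(f"{pos + 1}\t{ref}\t{alt}")
```

Sampling positions without replacement guarantees that each logged SNP corresponds to exactly one changed base, so the log can serve directly as the truth set when scoring variant callers.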
