A QC pipeline for SV calls based on coverage and SNP calls

NCBI-Hackathons/ScrubSV

ScrubSV

A toolkit to detect and flag potentially false structural variants (SVs) based on SNPs and coverage.

Hackathon team:

  - Lead: Fritz Sedlazeck
  - SysAdmin: Steve Osazuwa
  - Programmers: Priya Krithivasan, Kshithija Nagulapalli, David Oliver, Seungyeul Yoo

How to use

Input requirements

  1. BAM file
  2. Structural variants in VCF format
  3. Gold-standard SV callset
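
Before starting a run, it can help to confirm that all three inputs are actually in place. The following is a minimal sketch, not part of ScrubSV: `check_inputs` is a hypothetical helper, the file names in the usage example are placeholders, and the extension rules (`.bam`, `.vcf`, `.vcf.gz`) are assumptions. The gold-standard callset is only checked for existence, since its format is not stated above.

```python
from pathlib import Path


def check_inputs(bam, sv_vcf, gold_standard):
    """Return a list of problems found with the three input paths.

    Hypothetical helper for illustration; not part of ScrubSV.
    """
    problems = []

    # Each entry: (label, path, accepted suffixes or None to skip the check).
    # The suffix rules are assumptions; the gold-standard callset is only
    # checked for existence because its format is not specified.
    inputs = [
        ("BAM file", Path(bam), (".bam",)),
        ("SV calls", Path(sv_vcf), (".vcf", ".vcf.gz")),
        ("gold-standard SV callset", Path(gold_standard), None),
    ]

    for label, path, suffixes in inputs:
        if not path.is_file():
            problems.append(f"{label} not found: {path}")
        elif suffixes is not None and not path.name.endswith(suffixes):
            problems.append(f"{label} has an unexpected extension: {path}")
    return problems


if __name__ == "__main__":
    # Placeholder file names; replace with your own paths.
    for problem in check_inputs("sample.bam", "calls.vcf", "gold_standard.vcf"):
        print(problem)
```

An empty list means all three files were found and the BAM and VCF extensions look as expected.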

Installation

git clone --recursive https://github.com/NCBI-Hackathons/ScrubSV.git
cd ScrubSV

Install HapCUT2

Refer to https://github.com/vibansal/HapCUT2 for installation instructions

Install SURVIVOR

Refer to https://github.com/fritzsedlazeck/SURVIVOR for installation instructions. Move the executable to the current working directory.

Dependencies

vcftools and GenomicRanges

Install vcftools
sudo apt-get install vcftools

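Install GenomicRanges (run inside R; on Bioconductor 3.8 or later, biocLite has been replaced by BiocManager::install("GenomicRanges"))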
source("https://bioconductor.org/biocLite.R")
biocLite("GenomicRanges")
library(GenomicRanges)
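
After installing, a quick check that the tools can be found may save a failed run. The sketch below is an illustration under stated assumptions, not part of ScrubSV: `missing_tools` is a hypothetical helper, and it assumes that `vcftools` and `Rscript` are expected on the `PATH` and that the SURVIVOR executable is named `SURVIVOR` and sits in the current working directory, as described above. HapCUT2 is not checked because its install location depends on how you built it.

```python
import os
import shutil
from pathlib import Path


def missing_tools(on_path=("vcftools", "Rscript"), in_cwd=("SURVIVOR",), cwd="."):
    """Return the names of tools that could not be located.

    Hypothetical helper for illustration; the default tool names are
    assumptions about a typical setup, not something ScrubSV defines.
    """
    # Tools expected to be reachable through the PATH.
    missing = [tool for tool in on_path if shutil.which(tool) is None]

    # Executables expected to be copied into the working directory.
    for name in in_cwd:
        candidate = Path(cwd) / name
        if not (candidate.is_file() and os.access(candidate, os.X_OK)):
            missing.append(name)
    return missing


if __name__ == "__main__":
    not_found = missing_tools()
    if not_found:
        print("Missing:", ", ".join(not_found))
    else:
        print("All checked tools were found.")
```

This only confirms that the executables exist; it does not verify versions or that the GenomicRanges R package is installed.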

Slides from the hackathon

We created and presented this method at the NYGC NCBI Hackathon in August 2018. Here is the link to the slides.

Software Workflow Diagram

We included a workflow diagram of our pipeline that describes the modules in the repo. Feel free to check it out.