GitHub - fifdick/wgsPIPE: Bash pipeline for whole-genome sequencing analysis or pre-processing.


Code for an old WGS pipeline.
Developed to work in the secure environment of TSD (https://www.uio.no/english/services/it/research/sensitive-data/) with their setup at that time (2018), which used the Slurm Workload Manager (sbatch) to send jobs to the cluster. This made it possible to preprocess fasta files of multiple samples in parallel.

This file keeps track of all scripts in this directory: what each one does and how it is used.

################################
WGS PIPELINE:
################################

includes the following scripts:
wgs_pipe.sh
wrap_wgs.sh
config_wgs.xml

All steps needed to analyse NGS data (WGS) are included as functions in the wgs_pipe.sh script.
To see how those functions are called, look at the help:
> ./wgs_pipe.sh -h
Functions include:
-trimming
-alignment (bwa mem, indexing, dedupping, recalibrating bases)
-variant calling with the haplotype caller
-variant recalibration, see help for which parameters can be set
-variant calling (joint calling and merging, see help for details)

All tools and reference data used in this pipeline are specified in the config_wgs.xml file and can be changed accordingly (this might cause failures along the way when newer versions of tools change their syntax).

wgs_pipe.sh is a bash script and should be called by wrap_wgs.sh, which is a bash sbatch script that specifies the resources used on colossus.
The wrapper is submitted with
> sbatch wrap_wgs.sh
wrap_wgs.sh should be edited for each job. It needs to specify the input files, output directories and so on.
It is called and adjusted sample-wise.
Also see /cluster/projects/p94/fdi/jobscripts/wrap_varcall.sh
for another example of this kind of file. (That file only calls the variant calling functions of wgs_pipe.sh.)
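
To illustrate the wrapper idea, here is a minimal sketch of what such an sbatch wrapper could look like. It is not the original wrap_wgs.sh: the resource values, the account (guessed from the project path above), the PIPE and DRY_RUN variables and the pass-through of options are all assumptions made for illustration. The real options of wgs_pipe.sh are not listed here, so none are filled in; check ./wgs_pipe.sh -h for them.

```shell
#!/bin/bash
# Hypothetical sbatch wrapper in the spirit of wrap_wgs.sh.
# Every value below is a placeholder, not the original setting.
# The account name is assumed from the project path in this README.
#SBATCH --job-name=wgs_sample
#SBATCH --account=p94
#SBATCH --time=24:00:00
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=4G

# Path to the pipeline script (assumed variable; override if needed).
PIPE="${PIPE:-./wgs_pipe.sh}"

# Run the pipeline with the options given to this wrapper.
# With DRY_RUN=1 the command is only printed, so a wrapper can be
# checked before it is submitted.
run_pipeline() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$PIPE $*"
    else
        "$PIPE" "$@"
    fi
}

# Only run when options were passed, e.g.
#   sbatch wrap_sketch.sh <options from ./wgs_pipe.sh -h>
if [ "$#" -gt 0 ]; then
    run_pipeline "$@"
fi
```

Unlike the original, which is edited for each job, this sketch takes the pipeline options on the command line, so one file could be reused across samples.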

################################
CNV PIPELINE:
################################
includes the following scripts:
cnv_template.sh
cnv_pipe.pl

It works on a different principle.
Here the main script is a perl script:
> perl cnv_pipe.pl inputsample_1 [..] inputsample_n
This perl script generates a jobscript (a bash script) for each sample that is being analysed.
For the script generation it uses the template file cnv_template.sh
and invokes
> sbatch jobscript.sh
after script generation.
It spreads multiple samples to run in parallel on /cluster.
The difference to wgs_pipe.sh is that it has a wrapper script (perl) that automatically generates jobscripts for all samples
so that they run in parallel on colossus, whereas in WGS PIPELINE you need to manually create your version of the wrapper script for each sample.
But in contrast to wgs_pipe, cnv_pipe has all job steps (cnvnator steps and erds command) within the sbatch script, which might not be so good after all.
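
The generate-then-submit principle described above can be sketched in a few lines of bash. This is a stand-in for illustration, not the perl wrapper itself: the function name, the __SAMPLE__ placeholder, the cnv_<sample>.sh naming and the DRY_RUN switch are all assumptions, and the real cnv_template.sh may mark its substitution points differently.

```shell
#!/bin/bash
# Sketch of template-driven jobscript generation (bash stand-in for
# the perl wrapper). Placeholder and file naming are hypothetical.

# generate_jobscripts TEMPLATE OUTDIR SAMPLE...
# Writes one jobscript per sample and submits it with sbatch.
# With DRY_RUN=1 the sbatch command is printed instead of executed.
generate_jobscripts() {
    local template="$1" outdir="$2"
    shift 2
    mkdir -p "$outdir"
    local sample name jobscript
    for sample in "$@"; do
        # Use the file name part only, so paths work as sample inputs.
        name=$(basename "$sample")
        jobscript="$outdir/cnv_${name}.sh"
        # Replace every placeholder with the sample as given.
        # Assumes the sample string contains no '|' or '&'.
        sed "s|__SAMPLE__|${sample}|g" "$template" > "$jobscript"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf 'sbatch %s\n' "$jobscript"
        else
            sbatch "$jobscript"
        fi
    done
}
```

Calling it as DRY_RUN=1 generate_jobscripts cnv_template.sh jobs sample_1 sample_2 would write jobs/cnv_sample_1.sh and jobs/cnv_sample_2.sh and print the two sbatch commands without submitting anything.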

################################
OTHER:
################################
this includes:
general_job.sh
split_vcf.sh

general_job.sh is an sbatch jobscript with no special specifications; it just calls a command or script with at most 2 further input parameters.

> sbatch general_job.sh <cmd> <param1> <param2>

An example is the split_vcf.sh script, a short script that splits multi-sample vcf files into single-sample vcf files. Since
it is not an sbatch script, it cannot be submitted to colossus directly. To do this you can call it through the general_job.sh script like so:

> sbatch general_job.sh split_vcf.sh /vcf_input/dir /output/dir
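
A generic jobscript of this kind can be very small. The following is a sketch of the idea only, not the contents of general_job.sh: the resource values, the account (guessed from the project path mentioned earlier) and the run_job function are assumptions.

```shell
#!/bin/bash
# Hypothetical generic jobscript in the spirit of general_job.sh.
# Resource values are placeholders; the account is assumed.
#SBATCH --job-name=general_job
#SBATCH --account=p94
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=2G

# run_job CMD [PARAM1] [PARAM2]
# Runs CMD with at most two further parameters and returns its status.
run_job() {
    if [ "$#" -lt 1 ] || [ "$#" -gt 3 ]; then
        echo "usage: sbatch general_job.sh <cmd> [param1] [param2]" >&2
        return 2
    fi
    "$@"
}

# Only run when a command was passed to the jobscript.
if [ "$#" -gt 0 ]; then
    run_job "$@"
fi
```

Note that in a sketch like this the command is looked up like any other shell command, so a script in the current directory has to be on PATH or be given with a path such as ./split_vcf.sh.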
