Viper workflow

Viper is a Snakemake workflow that performs the RNA-seq analysis of the paper 'Causes and Consequences of A Glutamine Induced Normoxic HIF1 Activity for the Tumor Metabolism', Kappler et al. (2019), in a reproducible, automated, and partially contained manner. It is implemented such that alternative or similar analyses can be added or removed.

Viper consists of a Snakefile (workflow/HIF_version_1.0/snakefile), conda environment files (envs/*.yaml), a configuration file (workflow/HIF_version_1.0/config.yaml), a set of R functions (R/*.R), and a set of R scripts (scripts/*.R) that perform quality control, preprocessing, differential expression analysis, and functional annotation of RNA-seq data.

By default, the pipeline performs all the steps shown in the diagram below. Advanced users can modify the Snakefile and the config.yaml and/or add "custom rules" to enable additional functions. Currently, transcript quantification with Salmon at the read level or gene quantification with featureCounts can be activated.

Workflow graph

This workflow performs differential expression analysis on paired-end RNA-seq data. After adapter removal with Cutadapt and quality filtering with sickle, reads are mapped with STAR to the human genome (GRCh38.82), and transcript counts are quantified with Salmon. These transcript counts are summarized to gene counts with tximport. Integrated normalization and differential expression analysis are conducted with edgeR. Finally, the Database for Annotation, Visualization and Integrated Discovery (DAVID v6.8) is used for functional annotation of the differentially expressed genes.

Set up the VIPER workflow

Assuming that snakemake and conda are installed (and your system has the necessary libraries to compile R packages), you can use the following commands on a test dataset:
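The assumption above can be checked before starting. The following is a minimal sketch, not part of the Viper repository: the helper name `missing_tools` is hypothetical, and the list of tools (git, conda, snakemake) is taken from the steps in this README.

```shell
# Print the name of every given tool that is not found on PATH.
# (missing_tools is a hypothetical helper, not part of Viper.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Check the tools that the setup steps in this README rely on.
missing="$(missing_tools git conda snakemake)"
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing >&2
else
    echo "All required tools found."
fi
```

If any tool is reported as missing, install it before continuing with Step 0.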

0. Step - Clone the GitHub repository

git clone https://github.com/GrosseLab/ViperWF.git

1. Step - Set up the needed folder and copy files from `viper/workflow/HIF_version_1.0`

Folder and File Structure

Here is the basic suggested skeleton for your project folder:

  .
  ├── data
  │   ├── qPCR                  # qPCR raw data
  │   └ *.fastq.gz              # all 'fastq.gz'-files from !...!
  │
  ├── references
  │   └── hg38                  # all data from Homo_sapiens.GRCh38.82
  │       ├ Homo_sapiens.GRCh38.82.gtf                     # annotation
  │       ├ Homo_sapiens.GRCh38.dna.primary_assembly.fa    # genome sequence
  │       └ Homo_sapiens.GRCh38.82.EXON.fa                 # exon sequences of all transcripts in the GTF
  │
  ├── logs
  ├── report
  │
  ├── viper                     # GitHub repository
  │   ├── report                # Snakemake report definition
  │   ├── wrapper               # Snakemake wrapper
  │   ├── rules                 # Snakemake rules
  │   ├── scripts               # Snakemake scripts
  │   ├── workflow              # Snakemake final workflows
  │   │   └ HIF_version_1.0
  │   ├── R                     # R functions needed to run the analysis
  │   └── man                   # R functions manual
  │
  ├── Snakefile                 # file from ./viper/workflow/HIF_version_1.0
  ├── config.yaml               # file from ./viper/workflow/HIF_version_1.0
  ├── units.tsv                 # file from ./viper/workflow/HIF_version_1.0
  └── samples.tsv               # file from ./viper/workflow/HIF_version_1.0

Create the folders and copy the files from viper/workflow/HIF_version_1.0

mkdir data
mkdir data/qPCR
mkdir references
mkdir logs
mkdir report

cp ./viper/workflow/HIF_version_1.0/Snakefile .
cp ./viper/workflow/HIF_version_1.0/config.yaml .
cp ./viper/workflow/HIF_version_1.0/units.tsv .
cp ./viper/workflow/HIF_version_1.0/samples.tsv .

cp ./viper/workflow/HIF_version_1.0/copy.csv ./data/qPCR/
cp ./viper/workflow/HIF_version_1.0/qPCR_data.csv ./data/qPCR/
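
To confirm that the project folder matches the suggested skeleton, a small check like the one below may help. It is a sketch under assumptions: `missing_paths` is a hypothetical helper, and the list of expected folders and files is copied from the skeleton in this README (the reference files under references/hg38 are not checked).

```shell
# List every expected folder or file that is missing below the given
# project folder. (missing_paths is a hypothetical helper, not part of Viper.)
missing_paths() {
    root="$1"
    for dir in data data/qPCR references logs report; do
        [ -d "$root/$dir" ] || echo "$dir/"
    done
    for file in Snakefile config.yaml units.tsv samples.tsv; do
        [ -f "$root/$file" ] || echo "$file"
    done
}

# Check the current directory; no output means nothing is missing.
missing_paths .
```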

2. Step - Download data (will be available soon)

Download the data from the Gene Expression Omnibus (GEO) project GSExxx using the NCBI SRA Toolkit:

Download the sra files using the 'SRA Run Selector' or the SRA Toolkit from https://www.ncbi.nlm.nih.gov/geo/query/XXXX
Convert the *.sra files to *.fastq.gz files using fastq-dump from the SRA Toolkit
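
The conversion step could look roughly like the sketch below. It only prints the fastq-dump commands so they can be reviewed first. The helper name `print_fastq_dump_commands` and the `./sra` download folder are hypothetical, and the choice of `--split-files` (one file per mate, since the data are paired-end) together with `--gzip` and `--outdir` is an assumption, not something the Viper authors prescribe.

```shell
# Print (not run) one fastq-dump command per .sra file in a folder.
# (print_fastq_dump_commands is a hypothetical helper, not part of Viper.)
print_fastq_dump_commands() {
    sra_dir="$1"   # folder holding the downloaded *.sra files (assumed)
    out_dir="$2"   # folder that should receive the *.fastq.gz files
    for sra in "$sra_dir"/*.sra; do
        # Skip the unexpanded pattern when the folder has no .sra files.
        [ -e "$sra" ] || continue
        echo "fastq-dump --split-files --gzip --outdir $out_dir $sra"
    done
}

print_fastq_dump_commands ./sra ./data
```

Once the printed commands look right, they can be executed by piping the output to `sh`.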

3. Step - Run Snakemake

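snakemake -kn                              # dry run (-n): list the jobs without executing them
snakemake --create-envs-only --use-conda   # create the conda environments without running any jobs
snakemake -k -p --use-conda -j 20          # run the workflow: keep going after failed jobs (-k), print shell commands (-p), use at most 20 cores (-j 20)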

After the run, the project folder contains a new results folder:

  .
  ├── data
  ├── references
  ├── report
  ├── viper                     # GitHub repository
  │
  ├── logs                      # logs of the Snakemake rules
  ├── results                   # new folder for the results of the Snakemake rules
  │
  ├── Snakefile
  ├── config.yaml
  ├── units.tsv
  └── samples.tsv
