rna_seq

Overview:

An RNA-Seq pipeline that uses HISAT2 for alignment and Kallisto for pseudo-alignment and abundance quantification.

Download Data:

The dataset used is GSE120534. Once the accession list (SRR_Acc_List.txt) is downloaded, install the SRA Toolkit (https://github.com/ncbi/sra-tools) and run the following:

prefetch --option-file SRR_Acc_List.txt

Once the files are downloaded, extract the FASTQ files as follows:

for file in *; do fastq-dump --split-files "$file"; done
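
The same step can also be driven from the accession list itself. The following is a minimal Python sketch, not part of this repository: the function names read_accessions and fastq_dump_commands are made up for illustration, and it assumes SRR_Acc_List.txt holds one accession per line. It only builds the fastq-dump argument lists; nothing is executed.

```python
def read_accessions(text):
    """Return the non-empty, whitespace-stripped accession IDs, one per input line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def fastq_dump_commands(accessions):
    """Build one `fastq-dump --split-files` argument list per accession."""
    return [["fastq-dump", "--split-files", acc] for acc in accessions]
```

Each returned list can be handed to subprocess.run(cmd, check=True) once the contents of the accession file have been read into a string.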

Requirements:

Kallisto. It can also be installed with conda install kallisto if conda is available.
HISAT2.
Reference index:

Transcriptome index for Homo sapiens if Kallisto is used. Alternatively, the index can be built with kallisto index.
Genome index for Homo sapiens if HISAT2 is used. Alternatively, the index can be built with hisat2-build.

Arguments:

-a | --aligner-to-use: Specify 1 to use Kallisto or 2 to use HISAT2. DEFAULT: Kallisto
-i | --input-files-directory: Enter the path of the directory containing the FASTQ files.
-r | --reference-index: Enter the path of the reference index. If Kallisto is selected, enter the path of the indexed reference transcriptome (ends with .idx). If HISAT2 is selected, enter the path of the indexed reference genome, including the index prefix.
-o | --output-directory: Enter the name of the directory that will contain the output of the selected aligner.
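
To make the argument list concrete, here is a minimal argparse sketch of a parser with the same flags. It is not the code in aligner_wrapper.py: the function name build_parser, the dest names, and the choice to make -i, -r and -o required are assumptions made for this example. Only the flag names and the Kallisto default come from the list above.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative sketch only)."""
    parser = argparse.ArgumentParser(
        description="Run Kallisto or HISAT2 on a directory of FASTQ files."
    )
    # 1 = Kallisto (the documented default), 2 = HISAT2.
    parser.add_argument("-a", "--aligner-to-use", dest="aligner", type=int,
                        choices=[1, 2], default=1,
                        help="1 for Kallisto (default), 2 for HISAT2")
    # Marking the remaining flags as required is an assumption of this sketch.
    parser.add_argument("-i", "--input-files-directory", dest="input_dir",
                        required=True, help="directory containing FASTQ files")
    parser.add_argument("-r", "--reference-index", dest="reference",
                        required=True,
                        help=".idx file for Kallisto, index prefix for HISAT2")
    parser.add_argument("-o", "--output-directory", dest="output_dir",
                        required=True, help="directory for aligner output")
    return parser
```

Calling build_parser().parse_args() in a script would then read the flags from the command line; an invalid value for -a, such as 3, is rejected by argparse before any aligner is started.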

Script execution:

1. Alignment / Pseudoalignment and quantification:

1. Kallisto:

Run ./aligner_wrapper.py -a 1 -i <FASTQ files directory> -r <reference index directory> -o <aligner output directory> to use Kallisto for pseudo-alignment and generate the abundance files.
Alternatively, the following bash script can be run in the directory where the input files are present:

mkdir -p kallisto_output  # make sure the parent output directory exists
for f in $(ls *.fastq | sed 's/_[12]\.fastq$//' | sort -u)
do
    kallisto quant -i ../reference/homo_sapiens/transcriptome.idx -o "kallisto_output/${f}" "${f}_1.fastq" "${f}_2.fastq"
done
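
The loop above derives one sample prefix per pair of _1.fastq / _2.fastq files and runs kallisto quant once per prefix. A minimal Python sketch of the same pairing logic follows. It is not taken from aligner_wrapper.py, and the function names sample_prefixes and kallisto_commands are hypothetical. Unlike the sed pipeline, it ignores file names that do not match the _1/_2 pattern, and it does not check that both mates exist.

```python
import re


def sample_prefixes(filenames):
    """Return sorted unique prefixes of names shaped like <prefix>_1.fastq or <prefix>_2.fastq."""
    prefixes = set()
    for name in filenames:
        match = re.fullmatch(r"(.+)_[12]\.fastq", name)
        if match:
            prefixes.add(match.group(1))
    return sorted(prefixes)


def kallisto_commands(filenames, index, output_dir):
    """Build one `kallisto quant` argument list per paired-end sample prefix."""
    commands = []
    for prefix in sample_prefixes(filenames):
        commands.append([
            "kallisto", "quant",
            "-i", index,                       # transcriptome index (.idx)
            "-o", f"{output_dir}/{prefix}",    # one output directory per sample
            f"{prefix}_1.fastq", f"{prefix}_2.fastq",
        ])
    return commands
```

The argument lists are only constructed here; passing each one to subprocess.run(cmd, check=True) would execute it.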

2. HISAT2:

Run ./aligner_wrapper.py -a 2 -i <FASTQ files directory> -r <reference index directory>genome -o <aligner output directory> to use HISAT2 for alignment. Here -r is the index directory followed by the index prefix (genome in this example).
Alternatively, the following bash script can be run in the directory where the input files are present:

for f in $(ls *.fastq | sed 's/_[12]\.fastq$//' | sort -u)
do
    hisat2 -x ../reference/grch38/genome -1 "${f}_1.fastq" -2 "${f}_2.fastq" -S "${f}.sam"
done
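
The shell loop above assumes that every sample has both mates. The following minimal Python sketch builds the same hisat2 invocations but skips samples whose _2.fastq file is missing. It is an illustration under stated assumptions, not the logic of aligner_wrapper.py, and the function name hisat2_commands is hypothetical.

```python
def hisat2_commands(filenames, index_prefix):
    """Build one `hisat2` argument list per sample that has both mate files."""
    names = set(filenames)
    suffix = "_1.fastq"
    commands = []
    for name in sorted(names):
        if not name.endswith(suffix):
            continue
        prefix = name[: -len(suffix)]
        mate = f"{prefix}_2.fastq"
        if mate not in names:
            continue  # second mate missing: skip instead of failing inside hisat2
        commands.append([
            "hisat2",
            "-x", index_prefix,      # index path including the prefix, e.g. .../genome
            "-1", name,
            "-2", mate,
            "-S", f"{prefix}.sam",   # SAM output named after the sample
        ])
    return commands
```

Skipping incomplete pairs silently is a design choice for this sketch; a real pipeline may prefer to report them.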

2. Differential Expression (DE):

1. DESeq2:

Run ./differential_expression.R -de DESeq -o <kallisto_output_directory> -m <meta_data_file> -p <pvalue_cutoff> to import transcript abundance files from Kallisto and perform DE analysis with DESeq2.

2. Sleuth:

Run ./differential_expression.R -de Sleuth -o <kallisto_output_directory> -m <meta_data_file> -p <pvalue_cutoff> to use Sleuth after quantifying with Kallisto.
