banskt/trans-eqtl-pipeline

Comparison of different methods in trans-eQTL

(Currently in development)

This pipeline measures the performance of different methods in finding trans-eQTLs on real data, as well as their performance on null data (i.e. genotype with shuffled donors). The following methods will be included:

We also want to compare:

  • effect of different pre-filtering methods
  • kNN
  • effect of sparsity in TEJAAS

Finally, we plot everything together:

  • Plot
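The null data mentioned above (genotype with shuffled donors) can be illustrated with a minimal sketch. It assumes the dosages are already loaded into a NumPy array with SNPs in rows and donors in columns; the function name `shuffle_donors` and the toy values are hypothetical and not part of the pipeline.

```python
import numpy as np


def shuffle_donors(dosage, seed=0):
    """Permute the donor columns of a (SNPs x donors) dosage matrix.

    Hypothetical helper for illustration. Permuting the columns breaks the
    link between each donor's genotype and that donor's expression profile,
    while leaving every SNP's dosage distribution unchanged.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(dosage.shape[1])
    # Fancy indexing returns a new array; the input is not modified.
    return dosage[:, perm], perm


# Toy dosage matrix: 2 SNPs x 4 donors (made-up values).
dosage = np.array([[0.0, 1.0, 2.0, 1.0],
                   [2.0, 2.0, 0.0, 1.0]])
null_dosage, perm = shuffle_donors(dosage, seed=42)
```

The same permutation is applied to every SNP, so linkage between SNPs is preserved and only the genotype-to-expression pairing is randomized.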

Method

We use the gene expression of two different tissues within the same population. We find trans-eQTLs using different methods, and then compare the methods using precision and recall, assuming that the tissue-consistent trans-eQTLs (those found in both tissues) are true positives while all other discoveries are false positives.
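The comparison described above can be sketched as follows. This is only an illustration under stated assumptions: discoveries are represented as sets of SNP identifiers, and because the text does not define the denominator for recall, the sketch takes it as an explicit argument `n_positives`. The function name `precision_recall` and the SNP identifiers are hypothetical.

```python
def precision_recall(found_a, found_b, n_positives):
    """Score the discoveries in tissue A against tissue B.

    Hypothetical helper for illustration.
    found_a, found_b : iterables of SNP ids called trans-eQTLs in each tissue.
    n_positives      : assumed size of the reference set used for recall
                       (the source does not specify how it is chosen).
    """
    found_a, found_b = set(found_a), set(found_b)
    true_pos = found_a & found_b    # tissue-consistent trans-eQTLs
    false_pos = found_a - found_b   # found in tissue A only
    precision = len(true_pos) / len(found_a) if found_a else 0.0
    recall = len(true_pos) / n_positives if n_positives else 0.0
    return precision, recall, true_pos, false_pos


# Toy example with made-up SNP ids.
tissue_a = {"rs1", "rs2", "rs3", "rs4"}
tissue_b = {"rs2", "rs4", "rs5"}
precision, recall, tp, fp = precision_recall(tissue_a, tissue_b, n_positives=4)
```

Repeating this calculation over a range of significance thresholds would give one precision-recall curve per method.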

Input

The pipeline expects the following input files:

  • Genotype (in gzipped dosage format)
  • Expression (tab-separated text file: genes in rows, samples in columns. Header row with sample-ids, first column with gene names. In the header row, the first column is named gene_id).
  • Sample (a dummy sample file in Oxford format)
  • GENCODE file
  • gene position file (for MatrixEQTL)
  • MAF file from 1000Genomes
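As a quick check of the expression file layout described above (tab-separated, genes in rows, samples in columns, first header field named gene_id), a small reader might look like the sketch below. The function name `read_expression` and the toy gene and sample names are hypothetical; the pipeline's own loader may differ.

```python
import csv
import io


def read_expression(handle):
    """Parse an expression table: header 'gene_id<TAB>sample1<TAB>...'.

    Hypothetical helper for illustration. Returns (samples, genes, values),
    where values[i][j] is the expression of genes[i] in samples[j].
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if header[0] != "gene_id":
        raise ValueError("first header field must be 'gene_id'")
    samples = header[1:]
    genes, values = [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(f"row for {row[0]} has {len(row) - 1} values, "
                             f"expected {len(samples)}")
        genes.append(row[0])
        values.append([float(x) for x in row[1:]])
    return samples, genes, values


# Toy file contents with made-up gene and sample names.
text = "gene_id\tS1\tS2\ngeneA\t0.5\t1.5\ngeneB\t-0.2\t0.0\n"
samples, genes, values = read_expression(io.StringIO(text))
```

For a real file, pass an open file handle instead of the `io.StringIO` object.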

Required software for the pipeline

  • Python >3.6 (numpy, mpmath)
  • TEJAAS
  • LDSTORE
  • GNeTLMM
  • R v3.4.1 (MatrixEQTL)

Required software for pre-processing

  • Python >3.6
  • VCFtools (v0.1.15)
  • htslib (v1.4.1 -- for tabix and bgzip)

How to run

  1. Within the bsubfiles folder, change the job submission criteria and module loadings as per your requirements (GWDG users can skip this step).
  2. Modify main/utils/submit_job to match your own job scheduling mechanism (bsub users can skip this step).
  3. Update the paths of external programs in main/EXTERNAL.
  4. Update the paths of your datasets in main/DATA.
  5. Create a CONFIG file (see the example in configs/CONFIG).
  6. Run the pipeline from within the main directory:
cd main
./01_validation_pipeline.sh configs/CONFIG
./02_process_chunks.sh configs/CONFIG
