TRON-Bioinformatics/tronflow-hla-hd

TronFlow HLA typing pipeline

Nextflow (Di Tommaso, 2017) pipeline for HLA typing using HLA-HD (Kawaguchi, 2017).

How to run it

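Prepare an input table listing the FASTQs for each sample as three tab-separated columns, and pass it with --input_fastqs. The file itself must not contain a header row; the column names in the first line below are shown for reference only.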

Sample name FASTQ1 FASTQ2
sample_1 /path/to/sample_1.1.fq.gz /path/to/sample_1.2.fq.gz
sample_2 /path/to/sample_2.1.fq.gz /path/to/sample_2.2.fq.gz
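A table like the one above can be generated rather than typed by hand. The following is a minimal sketch, assuming paired files named `<sample>.1.fq.gz` and `<sample>.2.fq.gz` in a single directory; the `demo_fastq` directory, the empty placeholder files and the `input_fastqs.tsv` file name are illustrative assumptions, not part of the pipeline.

```shell
#!/bin/sh
# Sketch: build a header-less, tab-separated FASTQ table.
# Assumption: pairs are named <sample>.1.fq.gz / <sample>.2.fq.gz.
set -eu

fastq_dir="$(pwd)/demo_fastq"   # hypothetical location, use your own
table="input_fastqs.tsv"        # hypothetical file name

# Demo only: create empty placeholder files so the sketch runs end to end.
mkdir -p "$fastq_dir"
for s in sample_1 sample_2; do
    : > "$fastq_dir/$s.1.fq.gz"
    : > "$fastq_dir/$s.2.fq.gz"
done

# One row per sample: name, FASTQ 1, FASTQ 2, separated by tabs, no header.
: > "$table"
for fq1 in "$fastq_dir"/*.1.fq.gz; do
    [ -e "$fq1" ] || continue
    sample=$(basename "$fq1" .1.fq.gz)
    fq2="$fastq_dir/$sample.2.fq.gz"
    if [ ! -e "$fq2" ]; then
        echo "skipping $sample: missing $fq2" >&2
        continue
    fi
    printf '%s\t%s\t%s\n' "$sample" "$fq1" "$fq2" >> "$table"
done

cat "$table"
```

Using `printf` with explicit `\t` avoids editors silently turning tabs into spaces, which would break the three-column layout.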

Alternatively, provide a table of BAM files as two tab-separated columns, again without a header, using --input_bams.

Sample name BAM
sample_1 /path/to/sample_1.bam
sample_2 /path/to/sample_2.bam

BAM files should be indexed.
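Before launching, it can help to confirm that every BAM in the table has an index next to it. The sketch below assumes the index is named either `<file>.bam.bai` (the default name written by `samtools index`) or `<file>.bai`; the `list_unindexed` helper, the `demo_bam` directory and the output file name are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch: report BAMs from a two-column table that lack an index file.
# Assumption: the index sits next to the BAM as <bam>.bai or <name>.bai.
set -eu

# Hypothetical helper: prints "sample<TAB>bam" for every BAM without an index.
list_unindexed() {
    while IFS="$(printf '\t')" read -r sample bam; do
        [ -n "$bam" ] || continue
        if [ ! -e "$bam.bai" ] && [ ! -e "${bam%.bam}.bai" ]; then
            printf '%s\t%s\n' "$sample" "$bam"
        fi
    done < "$1"
}

# Demo only: two placeholder BAMs, of which only the first has an index.
mkdir -p demo_bam
: > demo_bam/sample_1.bam
: > demo_bam/sample_1.bam.bai
: > demo_bam/sample_2.bam
printf '%s\t%s\n' \
    sample_1 demo_bam/sample_1.bam \
    sample_2 demo_bam/sample_2.bam > input_bams.tsv

# Each BAM listed here would still need indexing, e.g. with `samtools index`.
list_unindexed input_bams.tsv > unindexed_bams.tsv
cat unindexed_bams.tsv
```

In the demo only `sample_2` is reported, because `sample_1.bam.bai` exists.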

Run as indicated below.

$ nextflow run tron-bioinformatics/tronflow-hla-hd --help

N E X T F L O W  ~  version 19.07.0
Launching `main.nf` [intergalactic_shannon] - revision: e707c77d7b

Usage:
    nextflow run main.nf --input_files input_files --output output_folder

Input:
    * input_fastqs: the path to a tab-separated values file containing in each row the sample name, FASTQ 1 and FASTQ 2
    The input file does not have header!
    Example input file:
    name1       fastq1.fq.gz    fastq2.fq.gz
    name2       fastq1.fq.gz    fastq2.fq.gz
    * input_bams: the path to a tab-separated values file containing in each row the sample name and BAM
    The input file does not have header!
    Example input file:
    name1       name1.bam
    name2       name2.bam
    * output: output folder where results will be stored

Optional input:
    * reference: the reference genome to use (default: hg38, possible values: hg38 or hg19)
    * read_length: the read length (default: 50)
    * hlahd_folder: the HLA-HD folder (default: /code/hlahd.1.2.0.1)
    * bowtie2_folder: the bowtie2 folder (default: /code/bowtie/2.3.4.3)
    * bowtie2_module: the module to load with bowtie2
    * ld_library_path: the value to set in LD_LIBRARY_PATH
    * cpus: the number of CPUs per sample (default: 15)
    * memory: the amount of memory per sample (default: 30g)

References

  • Kawaguchi S, Higasa K, Shimizu M, Yamada R, Matsuda F. HLA-HD: An accurate HLA typing algorithm for next-generation sequencing data. Hum Mutat. 2017 Jul;38(7):788-797. doi: 10.1002/humu.23230. Epub 2017 May 12. PMID: 28419628
  • Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4), 316–319. doi: 10.1038/nbt.3820