peterk87/nf-villumina

Generic viral Illumina sequence analysis pipeline

Introduction

The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It comes with a Singularity container making installation trivial and results highly reproducible.

nf-villumina will

  • remove low quality reads (fastp)
  • filter for reads from a taxonomic group of interest (by default superkingdom Viruses (taxid=10239)) using Kraken2 and Centrifuge classification results
  • perform de novo assembly with Unicycler and Shovill on the taxonomic classification filtered reads
  • search all contig sequences using NCBI nucleotide BLAST against a database of your choice (we recommend the version 5 NCBI nt DB)
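The taxonomic filtering step can be illustrated with a minimal sketch. It is not the pipeline's own implementation, only an approximation under stated assumptions: the Kraken2 per-read output is in its default tab-separated format (status, read ID, taxid, length, LCA mapping; no --use-names), and a hypothetical file lists the taxids to keep, one per line (for example 10239 and its descendant taxids, which you would have to derive from the NCBI taxonomy yourself). The function name is made up for illustration.

```shell
# Print the IDs of reads that Kraken2 classified to any taxid in a list.
#   $1: file with one taxid per line (hypothetical input, e.g. 10239 plus descendants)
#   $2: Kraken2 per-read output, default format (status, read ID, taxid, length, LCA map)
kraken2_read_ids_for_taxids() {
  awk -F '\t' '
    FILENAME == ARGV[1] { keep[$1] = 1; next }   # first file: remember wanted taxids
    $1 == "C" && ($3 in keep) { print $2 }       # second file: classified reads only
  ' "$1" "$2"
}
```

The resulting read IDs could then be passed to a tool of your choice to subset the FASTQ files. Reads assigned to a descendant taxid that is missing from the list are not reported, so the list has to be complete for the group of interest.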

NOTE: You will need to create/download databases for Kraken2, Centrifuge and BLAST in order to get the most out of this workflow!

Pre-requisites

Taxonomic Classification for Kraken2 and Centrifuge

For taxonomic classification with Kraken2 and Centrifuge, you will need to download (or build) databases for these programs so that you may use them within the nf-villumina workflow.

To avoid specifying the database locations each time you run the workflow (e.g. --kraken2_db /path/to/kraken2/standard2 --centrifuge_db /path/to/centrifuge/nt-2018-03-03/nt), you can set them once in your ~/.bashrc with export KRAKEN2_DB=/path/to/kraken2/database and export CENTRIFUGE_DB=/path/to/centrifuge/database/prefix. Note that the Centrifuge value is an index prefix, not a directory.
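Before launching the workflow it can help to confirm that both paths really point at databases. The following is a minimal sketch, not part of nf-villumina: the function names are hypothetical, and it assumes the usual on-disk layout (a Kraken2 database directory containing hash.k2d, opts.k2d and taxo.k2d, and Centrifuge index files named <prefix>.N.cf).

```shell
# Succeeds if the directory looks like a Kraken2 database,
# i.e. it contains hash.k2d, opts.k2d and taxo.k2d.
check_kraken2_db() {
  for f in hash.k2d opts.k2d taxo.k2d; do
    if [ ! -f "$1/$f" ]; then
      echo "Kraken2 DB file not found: $1/$f" >&2
      return 1
    fi
  done
}

# Succeeds if at least one Centrifuge index file (<prefix>.N.cf) exists.
check_centrifuge_db() {
  for f in "$1".*.cf; do
    [ -f "$f" ] && return 0
  done
  echo "No Centrifuge index files found for prefix: $1" >&2
  return 1
}

# Example use, after setting the variables in ~/.bashrc:
#   check_kraken2_db "$KRAKEN2_DB" && check_centrifuge_db "$CENTRIFUGE_DB"
```

Both functions only test for the presence of files; they do not verify that the databases are complete or uncorrupted.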

Kraken2 DBs

Centrifuge DBs

BLAST DBs

For nf-villumina, you must have a version 5 BLAST DB with embedded taxonomic information installed, e.g. the version 5 nt DB (see https://ftp.ncbi.nlm.nih.gov/blast/db/v5/).

You can download and update pre-built BLAST DBs such as nt and nr from the NCBI FTP site using the update_blastdb.pl script included with your BLAST+ installation.

Show all available databases:

$ update_blastdb.pl --showall

Download the version 5 "nt" BLAST database to your current directory, decompressing the files and deleting the original compressed archives:

$ update_blastdb.pl --blastdb_version 5 nt --decompress

NOTE: For ease of use, all databases should be downloaded to the same directory (e.g. /opt/DB/blast), with the $BLASTDB environment variable set to that directory in your ~/.bashrc.

Check that your database has been downloaded properly and has taxids associated with the sequences contained within it:

$ blastdbcheck -db nt -must_have_taxids -verbosity 3
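The download and check steps above can be combined into one helper. This is a minimal sketch rather than anything shipped with nf-villumina: the function name fetch_blast_db and the DRY_RUN switch are hypothetical, and only the update_blastdb.pl and blastdbcheck invocations shown above are used.

```shell
# Download (or update) a version 5 BLAST database into a directory, then
# check that its sequences carry taxids.
#   $1: database name, e.g. nt
#   $2: target directory (default: $BLASTDB, else the current directory)
# With DRY_RUN=1 the two commands are printed instead of executed
# (the target directory is still created).
fetch_blast_db() {
  (
    db="$1"
    dir="${2:-${BLASTDB:-.}}"
    run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
      run="echo"
    fi
    mkdir -p "$dir" && cd "$dir" || exit 1
    $run update_blastdb.pl --blastdb_version 5 "$db" --decompress || exit 1
    $run blastdbcheck -db "$db" -must_have_taxids -verbosity 3
  )
}

# Example use:
#   DRY_RUN=1 fetch_blast_db nt /opt/DB/blast   # preview the commands
#   fetch_blast_db nt /opt/DB/blast             # download and check
```

The body runs in a subshell so that the directory change and the helper variables do not leak into your interactive session.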

Documentation

The peterk87/nf-villumina pipeline comes with documentation about the pipeline, found in the docs/ directory:

  1. Installation
  2. Pipeline configuration
  3. Running the pipeline
  4. Output and how to interpret the results
  5. Troubleshooting

Credits

peterk87/nf-villumina was originally written by Peter Kruczkiewicz.

Bootstrapped with nf-core/tools nf-core create.

Thank you to the nf-core/tools team for a great tool for bootstrapping the creation of production-ready Nextflow workflows.