ablab/spades

About SPAdes

SPAdes is a versatile toolkit designed for assembly and analysis of sequencing data. SPAdes is developed primarily for Illumina sequencing data but can also be used with IonTorrent data. Most SPAdes pipelines support a hybrid mode, i.e. they accept long reads (PacBio and Oxford Nanopore) as supplementary data.

The SPAdes package contains assembly pipelines for isolated and single-cell bacterial data, as well as for metagenomic and transcriptomic data. Additional modes support the discovery of bacterial plasmids and RNA viruses, as well as HMM-guided assembly. The package also includes supplementary tools for efficient k-mer counting and k-mer-based read filtering, assembly graph construction and simplification, sequence-to-graph alignment, and metagenomic binning refinement.

Current version: see VERSION file.

Quick start

  • The complete user manual can be found here. The information below is provided for convenience only and is not a substitute for the user guide.

  • SPAdes assembler supports:

    • Assembly of second-generation sequencing data (Illumina or IonTorrent);
    • PacBio and Nanopore reads that are used as supplementary data only.
  • SPAdes can assemble genomes, metagenomes, transcriptomes, viral genomes etc.

  • Download SPAdes binaries for Linux or macOS here. You can also compile SPAdes from source (requires g++ 9.0+, cmake 3.16+, zlib and libbz2). Running SPAdes requires only Python 3.8+ to be installed.

  • Test your SPAdes installation by running:

    bin/spades.py --test
  • A single paired-end library (separate files, gzipped):
    bin/spades.py -1 left.fastq.gz -2 right.fastq.gz -o output_folder
  • IonTorrent data:
    bin/spades.py --iontorrent -s it_reads.fastq -o output_folder
  • A paired-end library coupled with long PacBio reads:
    bin/spades.py -1 left.fastq.gz -2 right.fastq.gz --pacbio pb.fastq -o output_folder
  • Available assembly modes: --isolate, --sc, --plasmid, --meta, --metaplasmid, --metaviral, --rna, --rnaviral, --bio, --corona, --sewage.

  • Standalone tools in the SPAdes package: k-mer counting, k-mer cardinality estimation, k-mer-based read filtering, assembly graph construction, assembly graph simplification, alignment of long reads to an assembly graph, refinement of metagenomic binning.
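
The example commands above share one shape: an optional mode flag, the read files, and an output folder. As a rough illustration only, the sketch below prints such a command line without running it. The function name build_spades_cmd is hypothetical and not part of SPAdes; it uses only options that appear in the list above, and it assumes file names contain no spaces.

```shell
# Hypothetical helper, not part of SPAdes: prints a spades.py command line
# built from options listed above, without running the assembler.
# Assumes file names contain no spaces (the command is built as a plain string).
build_spades_cmd() (
    mode="$1"; left="$2"; right="$3"; out="$4"; pacbio="${5:-}"
    cmd="bin/spades.py"
    # An empty mode leaves the mode flag out entirely.
    if [ -n "$mode" ]; then cmd="$cmd --$mode"; fi
    cmd="$cmd -1 $left -2 $right"
    # Long PacBio reads are optional supplementary data.
    if [ -n "$pacbio" ]; then cmd="$cmd --pacbio $pacbio"; fi
    printf '%s\n' "$cmd -o $out"
)

# Paired-end library in metagenomic mode.
build_spades_cmd meta left.fastq.gz right.fastq.gz output_folder
# Paired-end library plus PacBio reads, no mode flag.
build_spades_cmd "" left.fastq.gz right.fastq.gz output_folder pb.fastq
```

Printing the command first makes it easy to review before running it. Consult the user manual for which options can be combined with each mode.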

Citation

If you use SPAdes in your research, please cite our latest paper.

If you perform hybrid assembly using PacBio or Nanopore reads, you may also cite Antipov et al., 2015. If you use multiple paired-end and/or mate-pair libraries, you may additionally cite the papers describing the SPAdes repeat resolution algorithms: Prjibelski et al., 2014 and Vasilinetc et al., 2015.

If you use other pipelines, please cite the following papers:

You may also cite the older papers, Nurk, Bankevich et al., 2013 or Bankevich, Nurk et al., 2012, especially if you assemble single-cell data.

Feedback and bug reports

Please leave your comments and bug reports at our GitHub repository tracker. If you have any trouble running SPAdes, please attach params.txt and spades.log from the output folder.
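
One possible way to gather the two requested files before opening a report is sketched below. The function name collect_spades_logs and the default archive name spades_report.tar.gz are hypothetical choices for illustration; the only assumption taken from the text above is that params.txt and spades.log sit at the top level of the output folder.

```shell
# Hypothetical helper, not part of SPAdes: bundles params.txt and spades.log
# from a SPAdes output folder into a single gzipped tar archive.
collect_spades_logs() (
    out_dir="$1"
    archive="${2:-spades_report.tar.gz}"
    # Stop early if either file is missing, rather than sending a partial report.
    for f in params.txt spades.log; do
        if [ ! -f "$out_dir/$f" ]; then
            echo "missing: $out_dir/$f" >&2
            exit 1
        fi
    done
    # -C stores the files without the output folder's path prefix.
    tar -czf "$archive" -C "$out_dir" params.txt spades.log
    printf '%s\n' "$archive"
)

# Example: collect_spades_logs output_folder
```

The function prints the archive path on success and returns a non-zero status if either file is absent.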