martinghunt edited this page Oct 21, 2015 · 17 revisions

Circlator: a tool to circularize genome assemblies

The input is a genome assembly in FASTA format and corrected PacBio or nanopore reads in FASTA or FASTQ format. Circlator will attempt to identify each circular sequence and output a linearised version of it. It does this by assembling all reads that map to contig ends and comparing the resulting contigs with the input assembly.

The input assembly must not be too fragmented. Circlator will join contigs whenever it can identify ones that can be unambiguously joined, but its main aim is to circularize the core genome and plasmids.

Any contig identified as circular then has its start position changed. If a dnaA gene is found, it is used as the start position (alternatively, the user can provide a FASTA file of sequences to search for within the contigs). If no dnaA gene is found, Prodigal is used to identify the gene nearest the centre of the contig, and that gene is used as the start position of the contig.
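The start-position logic described above can be sketched in a few lines of Python. This is an illustration only, not Circlator's own code: the function names `rotate_to_start`, `gene_nearest_centre` and `choose_start` are hypothetical, genes are assumed to be given as 0-based `(start, end)` tuples, and strand handling is ignored for simplicity.

```python
def rotate_to_start(seq, start):
    """Rewrite a circular sequence so that it begins at 0-based index `start`."""
    if not seq:
        return seq
    start %= len(seq)  # wrap around, since the sequence is circular
    return seq[start:] + seq[:start]


def gene_nearest_centre(genes, contig_length):
    """Return the (start, end) gene whose midpoint is closest to the contig centre."""
    centre = contig_length / 2
    return min(genes, key=lambda g: abs((g[0] + g[1]) / 2 - centre))


def choose_start(contig_length, dnaa_hit=None, predicted_genes=()):
    """Prefer a dnaA match; otherwise fall back to the predicted gene nearest the centre.

    Returns the chosen start coordinate, or None if there is nothing to anchor on.
    """
    if dnaa_hit is not None:
        return dnaa_hit[0]
    if predicted_genes:
        return gene_nearest_centre(predicted_genes, contig_length)[0]
    return None


# Toy example: no dnaA match, so the gene nearest the centre is used.
contig = "AAAACCCCGGGGTTTT"
genes = [(1, 3), (6, 9), (13, 15)]  # assumed gene predictions
start = choose_start(len(contig), predicted_genes=genes)
print(rotate_to_start(contig, start))  # CCGGGGTTTTAAAACC
```

Because the contig is circular, rotating it changes only where the linearised sequence begins, not its content.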

Installation

Please see the Circlator website for installation instructions.

Usage

For the impatient, read the brief instructions.

To understand the detailed log files and find out why contigs were or were not circularized, read the troubleshooting section.

Installation provides a single script called circlator, which can be used to run several tasks. Run circlator with no options to list all the available tasks. The tasks are:

  • all: this runs the complete Circlator pipeline. Specifically, it runs progcheck, mapreads, bam2reads, assemble, merge, clean, fixstart
  • mapreads: map reads to assembly
  • bam2reads: make reads from mapping to be reassembled
  • assemble: run assembly using reads from bam2reads
  • merge: merge original assembly and new assembly made by assemble
  • clean: remove small and completely contained contigs from assembly
  • fixstart: change start position of circular sequences
  • minimus2: run the minimus2-based circularisation pipeline
  • get_dnaa: download a file of dnaA genes (or other genes of the user's choice)
  • progcheck: check that dependencies are installed
  • version: print version and exit
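
For scripting, the task order above can be captured in a small Python sketch. This is not part of Circlator: `ALL_PIPELINE` and `list_tasks` are hypothetical names chosen for illustration. The only command it runs is circlator with no options, which, as described above, lists the available tasks. Options for individual tasks are not shown because they are not described on this page.

```python
import shutil
import subprocess

# Order in which the "all" task runs the individual tasks, as listed above.
ALL_PIPELINE = [
    "progcheck",
    "mapreads",
    "bam2reads",
    "assemble",
    "merge",
    "clean",
    "fixstart",
]


def list_tasks(executable="circlator"):
    """Run the script with no options and return whatever it prints.

    Returns None if the executable is not on PATH.
    """
    if shutil.which(executable) is None:
        return None
    result = subprocess.run([executable], capture_output=True, text=True)
    # The listing may go to stdout or stderr, so return both.
    return result.stdout + result.stderr


print(" -> ".join(ALL_PIPELINE))
```

Calling `list_tasks()` on a machine where Circlator is installed should return the same task listing the script prints when run by hand.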