David Chin edited this page Sep 16, 2016 · 10 revisions

About

CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes. It provides robust estimates of genome completeness and contamination by using collocated sets of genes that are ubiquitous and single-copy within a phylogenetic lineage. Genome quality can also be assessed using plots of key genomic characteristics (e.g., GC content, coding density) that highlight sequences falling outside the expected distributions of a typical genome. CheckM also provides tools for identifying genome bins that are likely candidates for merging based on marker set compatibility, similarity in genomic characteristics, and proximity within a reference genome tree.
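As a rough illustration of how collocated marker sets can yield completeness and contamination estimates, the following minimal sketch averages, over all marker sets, the fraction of markers found at least once (completeness) and the number of extra copies per marker (contamination). It is not CheckM's actual code: the function name `estimate_quality`, the input layout, and the gene identifiers are hypothetical and chosen only for this example.

```python
def estimate_quality(marker_sets, gene_counts):
    """Estimate (completeness, contamination) as percentages.

    marker_sets: list of non-empty sets of marker gene identifiers; each set
                 stands for one group of collocated single-copy genes
                 (hypothetical input layout, assumed for illustration).
    gene_counts: dict mapping a marker gene identifier to the number of
                 copies detected in the genome; absent markers may be omitted.
    """
    if not marker_sets:
        raise ValueError("at least one marker set is required")

    completeness = 0.0
    contamination = 0.0
    for marker_set in marker_sets:
        if not marker_set:
            raise ValueError("marker sets must not be empty")
        # Markers seen at least once count toward completeness.
        present = sum(1 for gene in marker_set if gene_counts.get(gene, 0) >= 1)
        # Every copy beyond the first counts toward contamination.
        extra = sum(max(gene_counts.get(gene, 0) - 1, 0) for gene in marker_set)
        completeness += present / len(marker_set)
        contamination += extra / len(marker_set)

    n_sets = len(marker_sets)
    return 100.0 * completeness / n_sets, 100.0 * contamination / n_sets


# Toy example with made-up marker identifiers.
toy_sets = [{"A", "B"}, {"C"}, {"D", "E", "F"}]
toy_counts = {"A": 1, "B": 2, "C": 1, "D": 1}
toy_completeness, toy_contamination = estimate_quality(toy_sets, toy_counts)
print(f"completeness={toy_completeness:.2f}% contamination={toy_contamination:.2f}%")
```

Weighting each set equally, rather than each gene, is the point of grouping collocated markers: a single missing or duplicated genomic region that carries several neighbouring markers then affects only one term of the average.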

Our companion tool GroopM can be used to recover genomes from metagenomic data.

Citing

If you use CheckM in your research, please cite:

CheckM relies on several other software packages; we recommend also citing:

  • pplacer: Matsen FA, Kodner RB, Armbrust EV. 2010. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics 11: doi:10.1186/1471-2105-11-538.
  • prodigal: Hyatt D, Locascio PF, Hauser LJ, Uberbacher EC. 2012. Gene and translation initiation site prediction in metagenomic sequences. Bioinformatics 28: 2223–2230.
  • HMMER: http://hmmer.org/

Contact Information

CheckM is in active development and we are interested in discussing all potential applications of this software. Inquiries can be sent to Donovan Parks (donovan.parks [at] gmail.com).

Information on submitting bug reports is available here.

Contributors

The following people have directly submitted enhancements to the CheckM code base or other substantial contributions: