De novo Assembly

Sam Minot edited this page Feb 7, 2020 · 2 revisions

To identify microbial genes, geneshot will:

  • perform de novo assembly of short reads with MEGAHIT,
  • identify protein-coding sequences with Prodigal, and
  • deduplicate similar gene sequences using MMseqs2.
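
As a rough illustration of how these three steps chain together, the sketch below builds (but does not run) one command line per tool. It is not geneshot's actual implementation: the function name `build_commands`, the file layout, and the choice of `mmseqs easy-cluster` are assumptions made for this example, and the identity and coverage percentages are assumed to map onto MMseqs2's fractional `--min-seq-id` and `-c` options.

```python
from pathlib import Path


def build_commands(r1, r2, workdir, min_identity=90, min_coverage=50):
    """Return illustrative command lines for assembly, gene calling, and deduplication.

    Hypothetical helper: paths and tool options are assumptions,
    not the commands geneshot itself runs.
    """
    work = Path(workdir)
    assembly_dir = work / "assembly"
    # MEGAHIT writes its contigs to final.contigs.fa in the output folder
    contigs = assembly_dir / "final.contigs.fa"
    proteins = work / "genes.faa"

    # 1. De novo assembly of paired-end short reads
    megahit = ["megahit", "-1", r1, "-2", r2, "-o", str(assembly_dir)]

    # 2. Protein-coding sequence prediction in metagenome mode
    prodigal = ["prodigal", "-i", str(contigs), "-a", str(proteins), "-p", "meta"]

    # 3. Deduplication; percentages are converted to fractions (assumed mapping)
    mmseqs = [
        "mmseqs", "easy-cluster",
        str(proteins), str(work / "dedup"), str(work / "tmp"),
        "--min-seq-id", str(min_identity / 100),
        "-c", str(min_coverage / 100),
    ]
    return [megahit, prodigal, mmseqs]


if __name__ == "__main__":
    for cmd in build_commands("reads_R1.fastq.gz", "reads_R2.fastq.gz", "work"):
        print(" ".join(cmd))
```

Each list could be passed to `subprocess.run` once the tools are installed; printing them first makes it easy to check the parameters before running anything.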

The following flags control these steps:

  • --phred_offset: The PHRED quality score offset used by MEGAHIT, default: 33
  • --min_identity: Percent amino acid identity cutoff used by MMseqs2 to combine similar genes, default: 90
  • --min_coverage: Percent alignment coverage (length) cutoff used by MMseqs2 to combine similar genes, default: 50
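
To make the units of these flags concrete, here is a minimal sketch of what the values mean. The helper names `phred_scores` and `percent_to_fraction` are hypothetical and not part of geneshot; the sketch assumes that `--min_identity` and `--min_coverage` are percentages, and it uses the standard convention that a FASTQ quality character encodes its score as the character code minus the PHRED offset.

```python
def phred_scores(quality, phred_offset=33):
    """Decode a FASTQ quality string into integer PHRED scores."""
    return [ord(ch) - phred_offset for ch in quality]


def percent_to_fraction(percent):
    """Convert a percent cutoff (0-100) to a fraction (0-1)."""
    if not 0 <= percent <= 100:
        raise ValueError(f"expected a percentage between 0 and 100, got {percent}")
    return percent / 100


if __name__ == "__main__":
    # With an offset of 33, 'I' encodes Q40, '5' encodes Q20, and '!' encodes Q0
    print(phred_scores("I5!"))      # [40, 20, 0]
    print(percent_to_fraction(90))  # 0.9
    print(percent_to_fraction(50))  # 0.5
```

Reads encoded with the older offset of 64 would decode to scores that are 31 too high under the default of 33, which is why the offset should match how the FASTQ files were produced.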