Alignment Based Analysis

Sam Minot edited this page Jan 17, 2020 · 1 revision

After performing de novo assembly individually for each specimen, identifying protein-coding genes, and deduplicating those genes across all samples to form a non-redundant gene catalog, geneshot will quantify the abundance of every gene in every specimen using an alignment-based approach.

In the first step, every read will be aligned against the gene catalog with DIAMOND. In the second step, any read that aligns to multiple genes will be resolved into a single unique alignment with the FAMLI algorithm.
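To make the second step concrete, the toy sketch below shows what it means to reduce a read with several candidate genes to a single assignment. It is not the FAMLI algorithm: it substitutes a deliberately simple rule (give each read to the candidate gene supported by the most reads overall), and the function name `resolve_multimapped` and the input layout are assumptions made only for this illustration.

```python
from collections import Counter


def resolve_multimapped(read_to_genes):
    """Assign every read to exactly one of its candidate genes.

    Simplified stand-in for illustration only (NOT FAMLI): each read goes to
    the candidate gene that the largest number of reads align to.
    """
    # Count how many reads list each gene as a candidate.
    support = Counter(
        gene for genes in read_to_genes.values() for gene in set(genes)
    )
    assignment = {}
    for read_id, genes in read_to_genes.items():
        # Highest support wins; ties are broken by gene name so the
        # result is deterministic.
        assignment[read_id] = min(genes, key=lambda g: (-support[g], g))
    return assignment


# Hypothetical input: read ID -> genes the read aligned to.
reads = {
    "r1": ["gA"],
    "r2": ["gA", "gB"],
    "r3": ["gA"],
    "r4": ["gB", "gC"],
}
resolved = resolve_multimapped(reads)
```

After this step every read maps to one gene, so per-gene read counts can be tallied without counting any read twice.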

Options available to the user are:

  • --dmnd_min_identity: Minimum percent amino acid identity required for a short-read alignment, default: 90
  • --dmnd_min_coverage: Minimum percent of the read (query) that an alignment must cover, default: 50
  • --dmnd_top_pct: For each short read, keep alignments scoring within X% of that read's best alignment, default: 1
  • --dmnd_min_score: Minimum alignment score required for a short-read alignment, default: 20
  • --gencode: Genetic code (NCBI translation table) used for conceptual translation, default: 11
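The sketch below illustrates how the four alignment thresholds could act on a set of alignment records, using the default values listed above. It is an illustration of the cutoffs' meaning, not DIAMOND's or geneshot's implementation: the `Alignment` record, its field names, and the `filter_alignments` function are hypothetical, and the reading of `--dmnd_top_pct` as "within X% of the read's best score" is an assumption of this sketch.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    # Hypothetical record layout, used only for this example.
    read_id: str
    gene_id: str
    pct_identity: float     # percent amino acid identity
    pct_query_cover: float  # percent of the read covered by the alignment
    score: float            # alignment score


def filter_alignments(alignments, min_identity=90.0, min_coverage=50.0,
                      top_pct=1.0, min_score=20.0):
    """Apply identity, coverage, and score cutoffs, then keep only
    alignments within top_pct percent of each read's best score."""
    passing = [
        a for a in alignments
        if a.pct_identity >= min_identity
        and a.pct_query_cover >= min_coverage
        and a.score >= min_score
    ]

    # Group the surviving alignments by read.
    by_read = defaultdict(list)
    for a in passing:
        by_read[a.read_id].append(a)

    kept = []
    for hits in by_read.values():
        best = max(h.score for h in hits)
        cutoff = best * (1 - top_pct / 100)
        kept.extend(h for h in hits if h.score >= cutoff)
    return kept


example = [
    Alignment("r1", "gA", 95, 80, 100.0),  # best hit for r1
    Alignment("r1", "gB", 95, 80, 99.5),   # within 1% of best: kept
    Alignment("r1", "gC", 95, 80, 90.0),   # more than 1% below best
    Alignment("r2", "gA", 85, 80, 100.0),  # identity below 90
    Alignment("r3", "gB", 95, 40, 100.0),  # coverage below 50
    Alignment("r4", "gC", 95, 80, 10.0),   # score below 20
]
kept = filter_alignments(example)
```

With these defaults a read can still retain more than one alignment (here `r1` keeps both `gA` and `gB`), which is the situation the second, FAMLI-based step exists to resolve.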