Comparative (meta)genomics pipelines

This is a collection of tools for comparing metagenomic datasets to each other and to references. They were also used in the Transmission of crAssphage paper (reference to be added once it's available).

  1. Compare many metagenomic datasets (reads) to a single reference for SNP calling and multiallelic site identification
  2. Pairwise comparison of metagenomic assemblies of a single microbe
  3. Compare a single metagenomic sample to a collection of references to identify what strain it's closest to

SNP calling on many metagenomic datasets

comparative_metagenomics/many_vs_one_snippy.snakefile

This pipeline takes as input a set of sequencing reads and a reference genome. SNPs are called against the reference using snippy, and the resulting variants are filtered to keep high-quality calls. So that many samples can be compared against each other, variants are normalized and decomposed using vt.

A heatmap of pairwise SNP similarity between samples is generated at the end.
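To illustrate the kind of matrix such a heatmap could be drawn from, here is a minimal sketch in Python. It assumes each sample's filtered, normalized variants have already been loaded as a set of `(position, ref, alt)` tuples, and it uses Jaccard similarity as the metric. Both the data layout and the choice of Jaccard are assumptions for illustration; the snakefile may compute similarity differently. The function names and the toy samples are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two variant sets (assumed metric, not necessarily the pipeline's)."""
    if not a and not b:
        # Two samples with no variants are treated as identical.
        return 1.0
    return len(a & b) / len(a | b)


def pairwise_similarity(variants_by_sample):
    """Build a symmetric sample-by-sample similarity table.

    variants_by_sample maps a sample name to a set of
    (position, ref, alt) tuples (hypothetical layout).
    """
    samples = sorted(variants_by_sample)
    # Every sample is fully similar to itself.
    sim = {s: {s: 1.0} for s in samples}
    for s1, s2 in combinations(samples, 2):
        score = jaccard(variants_by_sample[s1], variants_by_sample[s2])
        sim[s1][s2] = score
        sim[s2][s1] = score
    return samples, sim


# Toy data: three made-up samples called against the same reference.
variants = {
    "sampleA": {(100, "A", "G"), (250, "C", "T"), (900, "G", "A")},
    "sampleB": {(100, "A", "G"), (250, "C", "T")},
    "sampleC": {(400, "T", "C")},
}

samples, sim = pairwise_similarity(variants)
for s1 in samples:
    print(s1, [round(sim[s1][s2], 2) for s2 in samples])
```

The nested dictionary can be converted to a 2D array and passed to any plotting library to draw the heatmap.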

Pairwise comparison of metagenomic assemblies of a single microbe

comparative_metagenomics/compare_assembled_contigs.snakefile

This pipeline takes as input a set of metagenomic assemblies, keeps only contigs longer than 500 bp, and aligns those contigs to a reference genome. Contigs that align are then compared against each other, pairwise, using nucmer.
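The length filter and the pairwise enumeration can be sketched in plain Python as follows. This is an illustrative sketch only: the helper names (`read_fasta`, `filter_contigs`) and the toy contigs are hypothetical, the alignment to the reference and the nucmer runs are not shown, and the snakefile itself may implement these steps with other tools.

```python
from itertools import combinations


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token as the contig name.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_contigs(records, min_len=500):
    """Keep contigs strictly longer than min_len bases."""
    return [(name, seq) for name, seq in records if len(seq) > min_len]


# Toy assembly: contig_2 is exactly 500 bp, so it is dropped.
fasta = [
    ">contig_1 len=600", "A" * 600,
    ">contig_2", "C" * 500,
    ">contig_3", "G" * 300, "T" * 300,
]

kept = filter_contigs(read_fasta(fasta))
names = [name for name, _ in kept]

# Every unordered pair of retained contigs would be one pairwise comparison.
pairs = list(combinations(names, 2))
print(names)
print(pairs)
```

With real data, `read_fasta` could be fed an open file handle instead of a list, and each pair in `pairs` would correspond to one nucmer comparison.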