lcoghill edited this page Oct 20, 2014 · 15 revisions

Description

Phyloboost is a Python pipeline for reconstructing, augmenting, and visualizing the similarity-cluster-based tree sets produced by the Phylota pipeline. These are unrooted trees built from GenBank datasets that span all of Eukaryota and cover more than 60,000 genera.

Purpose

  • Reconstruct the original cluster sets in FASTA format for easy use.
  • Augment (expand) the clusters with additional sequences found through BLAST searches.
  • Filter the clusters, removing taxonomically mislabeled sequences.
  • Filter the clusters, removing any 'known' transposable elements.
  • Build alignments for all of the cluster sets.
  • Build unrooted trees for all of those alignments.
  • Attempt to root those trees via convex subtree graph methods developed by Rick Ree.
  • Compare and visualize those trees aligned to the NCBI taxonomy.
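
To make the first and third goals above more concrete, here is a minimal sketch of writing a cluster out in FASTA format after dropping flagged sequences. It is not Phyloboost's actual code: the names `to_fasta` and `drop_flagged`, the `(identifier, sequence)` record layout, and the example identifiers are all assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical record layout: (sequence identifier, nucleotide sequence).
Record = Tuple[str, str]


def drop_flagged(records: Iterable[Record], flagged_ids: Set[str]) -> List[Record]:
    """Return the records whose identifier is not in flagged_ids.

    flagged_ids stands in for whatever a filtering step reports, for
    example identifiers judged to be taxonomically mislabeled.
    """
    return [(seq_id, seq) for seq_id, seq in records if seq_id not in flagged_ids]


def to_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Serialize records as FASTA text, wrapping sequences at `width` columns."""
    lines: List[str] = []
    for seq_id, seq in records:
        lines.append(f">{seq_id}")
        # Split the sequence into fixed-width chunks, one per line.
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Made-up cluster used only to exercise the two helpers.
    cluster = [("seq1", "ACGTACGT"), ("seq2", "TTGACA"), ("seq3", "GGGCCC")]
    kept = drop_flagged(cluster, {"seq2"})
    print(to_fasta(kept, width=4), end="")
```

Keeping the filter separate from the serializer means the same writer could be reused after each filtering or augmentation stage; whether Phyloboost itself is organized this way is not stated on this page.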

Installation

  • Install the software requirements
  • Install the needed databases
  • Prime the databases
  • Clone the Phyloboost repository
  • Run the pipeline