Merge pull request #110 from rrohwer/master
updates to clarify directions and add script to reformat dada2 output
rrohwer committed Jun 11, 2019
2 parents bd08f9d + 6eea40e commit 35e8693
Showing 6 changed files with 1,977 additions and 467 deletions.
14 changes: 10 additions & 4 deletions FreshTrain-files/README.md
@@ -14,10 +14,16 @@ zipped file name | description
FreshTrain18Aug2016 | old version formatted for Greengenes (don't use)
FreshTrain25Jan2018Greengenes13_5.zip | current version formatted for Greengenes
FreshTrain30Apr2018SILVAv128.zip | current version formatted for SILVA v128
FreshTrain30Apr2018SILVAv132.zip | current version formatted for SILVA v132
**FreshTrain30Apr2018SILVAv132.zip** | **current version formatted for SILVA v132**

The different formats match the FreshTrain's coarse-level nomenclature to the nomenclature in the comprehensive database of choice. The FreshTrain defines lineage-clade-tribe (~family-genus-species) level phylogenies, so the phylum, class, and order names are changed in the different versions to be consistent with the chosen comprehensive database.
The different formats match the FreshTrain's coarse-level nomenclature to the nomenclature in the comprehensive database of choice. The FreshTrain defines lineage-clade-tribe (~family-genus-species) level phylogenies, so the phylum, class, and order names are changed in the different FreshTrain versions to be consistent with the paired comprehensive database.

<br>
The citation for the FreshTrain database is:
[Newton, R. J., Jones, S. E., Eiler, A., McMahon, K. D. & Bertilsson, S. A guide to the natural history of freshwater lake bacteria. Microbiol. Mol. Biol. Rev. 75, 14–49 (2011).](http://mmbr.asm.org/content/75/1/14.full)
The citation for the original FreshTrain database and the arb version of it is:

[Newton RJ, Jones SE, Eiler A, McMahon KD, Bertilsson S. 2011. A Guide to the Natural History of Freshwater Lake Bacteria. Microbiol Mol Biol Rev 75:14–49.](https://mmbr.asm.org/content/75/1/14.full) The arb files are available at [github.com/McMahonLab/FWMFG](https://github.com/McMahonLab/FWMFG).

<br>
The citation for these taxonomy assignment-compatible formats of the FreshTrain and the TaxAss method is:

[Rohwer RR, Hamilton JJ, Newton RJ, McMahon KD. 2018. TaxAss: Leveraging a Custom Freshwater Database Achieves Fine-Scale Taxonomic Resolution. mSphere 3:e00327-18.](https://msphere.asm.org/content/3/5/e00327-18)
7 changes: 2 additions & 5 deletions README.md
@@ -7,12 +7,9 @@ How do I TaxAss?

**Step-by-step directions:** [tax-scripts/TaxAss_Directions.html](https://htmlpreview.github.io/?https://github.com/McMahonLab/TaxAss/blob/master/tax-scripts/TaxAss_Directions.html)

Please cite our mSphere paper:
TaxAss: Leveraging a Custom Freshwater Database Achieves Fine-Scale Taxonomic Resolution
Robin R Rohwer, Joshua J Hamilton, Ryan J Newton, Katherine D McMahon
mSphere; doi: https://doi.org/10.1128/mSphere.00327-18
**Please cite TaxAss:** [Rohwer RR, Hamilton JJ, Newton RJ, McMahon KD. 2018. TaxAss: Leveraging a Custom Freshwater Database Achieves Fine-Scale Taxonomic Resolution. mSphere 3:e00327-18.](https://msphere.asm.org/content/3/5/e00327-18)

TaxAss uses a series of R, python, and bash scripts in addition to using BLAST+ and mothur's classify.seqs() command. The scripts are sourced from the terminal window (mac or linux). You'll need to download this repository (green "Clone or download" button, top right), and then just add the tax-scripts folder to your working directory.
TaxAss only assigns taxonomy, so you can use TaxAss after using mothur, dada2, vsearch, or whatever QC pipeline you prefer. TaxAss uses a series of R, python, and bash scripts in addition to using BLAST+ and mothur's classify.seqs() command. The scripts are sourced from the terminal window (mac or linux). You'll need to download this repository (green "Clone or download" button, top right), and add the tax-scripts folder to your working directory.
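
The download-and-copy step described above could be done from the terminal roughly as follows. This is a minimal sketch, not part of TaxAss: the function name `add_tax_scripts` and the directory names `TaxAss` and `my_analysis` are hypothetical, and it assumes that "adding the tax-scripts folder" means copying the whole folder into the working directory.

```shell
#!/bin/bash
# Sketch only: add_tax_scripts and the directory names are placeholders,
# not something the TaxAss repository provides.

# Copy the tax-scripts folder from a local copy of the repository
# into a working directory, creating the working directory if needed.
add_tax_scripts() {
    local repo_dir="$1"   # local copy of the repository
    local work_dir="$2"   # directory where the analysis will be run
    if [ ! -d "$repo_dir/tax-scripts" ]; then
        echo "no tax-scripts folder found in $repo_dir" >&2
        return 1
    fi
    mkdir -p "$work_dir" && cp -r "$repo_dir/tax-scripts" "$work_dir/"
}

# Example (requires git and network access):
#   git clone https://github.com/McMahonLab/TaxAss.git
#   add_tax_scripts TaxAss my_analysis
```

Cloning with git is an alternative to the "Clone or download" button; either way the result is a local copy of the repository that the function can copy from.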

Where's the stuff I need?
---
6 changes: 4 additions & 2 deletions tax-scripts/RunSteps_quickie.sh
@@ -4,17 +4,19 @@
# That means that you do not try different percent identity cutoffs to choose the best one.
# That might make sense for you if you have already made a similarity choice, for example by
# choosing a cutoff to cluster OTUs. Then just have pident match that cutoff.
# In almost all of our test datasets we found a pident of 98 was best.
# Note: this also skips the BLAST check (step 6). You could go back and just do that one.
# Note: still run step 16 to tidy up.
# Note: still gotta do the reformatting manually (step 0)

# Choose pident.
# USER CAN CHANGE THIS INPUT ---------------------------------

pident=("98")
fwbootstrap=("80")
ggbootstrap=("80")
processors=("2")

# Note: still gotta do the reformatting manually (step 0)
# -------------------------------------------------------------

# 1
makeblastdb -dbtype nucl -in custom.fasta -input_type fasta -parse_seqids -out custom.db &&