Merge pull request #109 from rrohwer/master
Minor Edits to Reflect mSphere publication
Showing 29 changed files with 153 additions and 54 deletions.
@@ -1,11 +1,5 @@
-These are scripts for reformatting the ARB output of the Freshwater Training Set
+# README arb-scripts

-Robin's in the process of pulling them together into a repeatable workflow.
-That's gonna be in this repo soon.
-
-The .pl script is the one Trina uses already.
-The ones Robin's adding are to address bugs she's encountered while working on 16STaxAss.
-
-When Robin is not writing code, she enjoys referring to herself in the 3rd person.
-
-RRR 2-8-16
+These are scripts for reformatting the ARB output of the Freshwater Training Set.
+
+This folder also includes the scripts and files used for the TaxAss manuscript's Marathonas Validation (fig 1 in paper). These are located inside the `Marathonas_test` folder.
@@ -0,0 +1,16 @@
+# README figure-scripts
+
+These scripts generate the final versions of figures and tables for the paper.
+Many of the figures and tables are already generated in the workflow, but this is
+the version of the script where I tweak colors, dimensions, resolution, etc.
+For tables, a csv with the data is generated, which mSphere later formatted for publication.
+
+Also included are figures generated for my ISME16 poster, figures that didn't make
+the final cut, and a script for finding all the stats listed in the text.
+
+The folder structure for reproducing the manuscript's data is in `TaxAss-BatchFiles.zip`.
+This folder pairs with the directions in `README-TaxAss-BatchFiles.html`.
+
+The Marathonas validation batch files and directions are inside the `arb-scripts` folder.
+The directions are in the file `arb-scripts/Marathonas_test/README_marathonas_validation.html`
+and all necessary files are also in that folder.
This file was deleted.
8 files renamed without changes.
Binary file not shown.
@@ -1,4 +1,6 @@
-Scripts for 16S Taxonomy Assignment Workflow
+README tax-scripts
 ===

-This folder contains the scripts used for OTU assignment. Detailed descriptions of them are in the `Clean Workflow.txt` document.
+This folder contains the TaxAss scripts. Detailed descriptions of them are in `TaxAss_Directions.html`. You can read the directions online here: [tax-scripts/TaxAss_Directions.html](https://htmlpreview.github.io/?https://github.com/McMahonLab/TaxAss/blob/master/tax-scripts/TaxAss_Directions.html)
+
+To use TaxAss, download this folder, add your data and databases, and use it as your working directory.
9 empty files.
@@ -0,0 +1,106 @@
+# RRR 2018-7-20
+
+# dada2 has an internal implementation of the Wang classifier / RDP classifier / mothur default / TaxAss choice,
+# BUT just keep using mothur with TaxAss because formatting is painful.
+# This script takes the dada2 output file (seqtab_nochim) and creates otus.fasta, otus.abund, and otus.count.
+
+# command line syntax:
+
+# Rscript reformat.dada2.seqtabs.R seqtab_nochim.rds otus.fasta otus.abund otus.count
+
+# ---- Accept Arguments from Terminal Command Line ----
+
+userprefs <- commandArgs(trailingOnly = TRUE)
+path.to.seqtab <- userprefs[1]
+fasta.output <- userprefs[2]
+abund.output <- userprefs[3]
+count.output <- userprefs[4] # assumption: the count file path is passed as a 4th argument (it was only hard-coded before)
+
+# Hard-coded paths for interactive testing only. Leave these commented out when running from the command line.
+# path.to.seqtab <- "/Users/athena/Desktop/dada2-meV45/dada2/seqtab_nochim.rds"
+# fasta.output <- "~/Desktop/otus.fasta"
+# abund.output <- "~/Desktop/otus.abund"
+# count.output <- "/Users/athena/Desktop/dada2-meV45/taxass_gg/data/otus.count"
+
+# ---- define functions ----
+
+import.dada2.file <- function(path){
+  is.rds <- grepl(pattern = "\\.rds$", x = path) # escaped dot, anchored at the end of the path
+  if (is.rds){
+    seqtab_nochim <- readRDS(file = path)
+    return(seqtab_nochim)
+  }else{
+    stop("input must be rds file ending in \".rds\"\nSave it in this format at the end of dada2 pipeline using\nsaveRDS(object = seqtab_nochim, file = \"filename\")")
+  }
+}
+
+make.fasta.file <- function(dadatable, fasta.path){
+  fasta.seqs <- colnames(dadatable)
+  otu.names <- paste("otu", seq_along(fasta.seqs), sep = "_")
+  fasta.names <- paste(">", otu.names, sep = "")
+  fasta.file <- paste(fasta.names, fasta.seqs, sep = "\n")
+  write.table(x = fasta.file, file = fasta.path, quote = FALSE, row.names = FALSE, sep = "\n", col.names = FALSE)
+  cat("Made file: ", fasta.path, "\n")
+  return(otu.names)
+}
+
+find.zero.samples <- function(tot.reads, otu.table){
+  read.stats <- boxplot(x = tot.reads, plot = FALSE)
+  cat("These samples have outlier read counts. Samples with zero reads are being removed.\n", paste(names(read.stats$out), " : ", read.stats$out, "\n", sep = ""))
+  index <- which(tot.reads == 0)
+  return(index)
+}
+
+convert.to.rel.abund <- function(OTUs){
+  sample.totals <- rowSums(OTUs)
+  # vectors are applied to matrices by stepping down rows in a column, so each row is divided by its own total
+  norm.otus <- OTUs / sample.totals * 100
+  return(norm.otus)
+}
+
+reformat.for.taxass <- function(OTUs){
+  # rows = OTUs, cols = samples, col 1 = seqIDs
+  otu.table <- t(OTUs)
+  otu.table <- cbind(row.names(otu.table), otu.table)
+  colnames(otu.table)[1] <- "seqID"
+  return(otu.table)
+}
+
+make.abund.file <- function(OTUs, FilePath){
+  write.table(x = OTUs, file = FilePath, quote = FALSE, sep = "\t", row.names = FALSE)
+  cat("Made file: ", FilePath, "\n")
+}
+
+make.count.file <- function(Count, FilePath){
+  Count <- cbind(names(Count), Count)
+  colnames(Count) <- c("Sample.Name", "Total.Reads")
+  write.table(x = Count, file = FilePath, quote = FALSE, sep = "\t", row.names = FALSE)
+  cat("Made file: ", FilePath, "\n")
+}
+
+# ---- use functions ----
+
+seqtab_nochim <- import.dada2.file(path = path.to.seqtab)
+
+seqIDs <- make.fasta.file(dadatable = seqtab_nochim, fasta.path = fasta.output)
+
+colnames(seqtab_nochim) <- seqIDs
+
+read.count <- rowSums(seqtab_nochim)
+
+index <- find.zero.samples(tot.reads = read.count, otu.table = seqtab_nochim)
+if(length(index) > 0){
+  read.count <- read.count[-index]
+  seqtab_nochim <- seqtab_nochim[-index, ]
+}
+
+seqtab_nochim <- convert.to.rel.abund(OTUs = seqtab_nochim)
+
+seqtab_nochim <- reformat.for.taxass(OTUs = seqtab_nochim)
+
+make.abund.file(OTUs = seqtab_nochim, FilePath = abund.output)
+
+make.count.file(Count = read.count, FilePath = count.output)
+
+
+
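The normalization and reshaping steps in the script above (`convert.to.rel.abund` followed by `reformat.for.taxass`) depend on R recycling a vector down the columns of a matrix. The following is a minimal sketch of that behavior on a made-up table; `toy.seqtab`, its sample names, and its counts are hypothetical and are not taken from the repository.

```r
# Toy stand-in for a dada2 sequence table: rows are samples, columns are sequences.
# All names and counts here are invented for illustration.
toy.seqtab <- matrix(data = c(10, 30, 60,
                               5,  5, 40),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("sampleA", "sampleB"),
                                     c("otu_1", "otu_2", "otu_3")))

# Dividing a matrix by a vector recycles the vector down each column.
# Because the vector has one entry per row, element [i, j] is divided by sample.totals[i].
sample.totals <- rowSums(toy.seqtab)
rel.abund <- toy.seqtab / sample.totals * 100

# Each sample (row) should now sum to 100 percent, up to floating point error.
print(rowSums(rel.abund))

# Transpose so rows are OTUs and columns are samples, then prepend the seqID column.
# cbind() with a character vector coerces the whole matrix to character.
otu.table <- t(rel.abund)
otu.table <- cbind(seqID = row.names(otu.table), otu.table)
print(otu.table)
```

If the table were oriented with samples in columns instead, the same division would silently normalize by the wrong totals, so it may be worth checking `rowSums()` of the result before writing the abundance file.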
Empty file.