GitHub - ianpgm/silva_to_dada2: Preparing Silva taxonomic databases for dada2

This is a Julia script that converts Silva databases in FASTA format into a format suitable for the assignTaxonomy function of dada2. For each record, the script truncates the taxonomy to a user-specified maximum number of levels, removes any trailing taxa that start with "uncultured", replaces spaces in taxon names with underscores, adds a trailing semicolon to the taxon string, replaces the genus name "Escherichia-Shigella" with "Escherichia/Shigella", and removes the sequence ID. In the sequences themselves, every U is replaced with a T.
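The transformations listed above can be illustrated with a short Python sketch. This is not the repository's Julia code. The function names, the order in which the steps are applied, and the assumption that the sequence ID is everything before the first space in the header are all illustrative guesses, and the accession in the usage note is made up.

```python
def convert_header(header: str, levels: int) -> str:
    """Sketch: turn a Silva FASTA header into a dada2-style taxonomy header."""
    # Assumption: the header looks like ">ID Taxon1;Taxon2;...", so the
    # sequence ID is everything up to the first space.
    _, _, taxonomy = header.lstrip(">").partition(" ")
    taxa = [t.strip() for t in taxonomy.split(";") if t.strip()]

    # Truncate to the requested maximum number of levels.
    taxa = taxa[:levels]

    # Drop trailing taxa that start with "uncultured".
    while taxa and taxa[-1].startswith("uncultured"):
        taxa.pop()

    cleaned = []
    for taxon in taxa:
        if taxon == "Escherichia-Shigella":
            taxon = "Escherichia/Shigella"
        # Replace spaces inside taxon names with underscores.
        cleaned.append(taxon.replace(" ", "_"))

    # Join with semicolons and add the trailing semicolon.
    return ">" + ";".join(cleaned) + ";"


def convert_sequence(sequence: str) -> str:
    """Sketch: replace RNA U with DNA T (upper case only)."""
    return sequence.replace("U", "T")
```

Under these assumptions, a hypothetical header such as ">XX000001.1.1500 Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacteriales;Enterobacteriaceae;Escherichia-Shigella;Escherichia coli" with levels set to 6 would become ">Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacteriales;Enterobacteriaceae;Escherichia/Shigella;".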

Usage

First, download and unzip the desired Silva database from the Silva FTP site. For example:

wget http://ftp.arb-silva.de/current/Exports/SILVA_132_SSURef_Nr99_tax_silva_trunc.fasta.gz
gunzip SILVA_132_SSURef_Nr99_tax_silva_trunc.fasta.gz

Provided you have Julia installed and in your PATH and you copy the script in this repository to the same directory as your downloads, you will be able to run the script like so:

julia silva_to_dada2.jl --input SILVA_132_SSURef_Nr99_tax_silva_trunc.fasta --levels 6 --output SILVA_132_SSURef_Nr99_tax_silva_trunc_dada2.fasta

Here, --input specifies the database FASTA file, --levels specifies the maximum number of taxonomic levels to keep (6 or 7 will usually be the right number), and --output specifies the desired name of the output file.
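After running the script, it can be useful to sanity-check the output headers against the format described above. The following Python sketch is one way to do that. The function name find_bad_headers is hypothetical and is not part of this repository, and the checks only cover the properties stated in this README: a trailing semicolon, no spaces in taxon names, and no more than the requested number of levels.

```python
def find_bad_headers(lines, levels):
    """Return the header lines that do not match the expected dada2 format."""
    bad = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith(">"):
            continue  # sequence line, nothing to check here
        parts = line[1:].split(";")
        # A well-formed header ends with ";", so the last piece is empty.
        taxa = parts[:-1]
        ok = (
            parts[-1] == ""
            and 1 <= len(taxa) <= levels
            and all(t and " " not in t for t in taxa)
        )
        if not ok:
            bad.append(line)
    return bad
```

To check a converted file, open it and pass the file object together with the value used for --levels, for example find_bad_headers(open("SILVA_132_SSURef_Nr99_tax_silva_trunc_dada2.fasta"), 6). An empty list means every header passed these checks.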
