Add sintax support for all available databases #618

Open
jtangrot opened this issue Aug 17, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@jtangrot
Contributor

Description of feature

I suggest making it possible to run sintax instead of assignTaxonomy with all taxonomic databases currently supported by ampliseq. It should only be a matter of reformatting the headers in the fasta files, according to the description in the vsearch manual:
The reference database must contain taxonomic information in the header of each sequence in the form of a string starting with ";tax=" and followed by a comma-separated list of up to eight taxonomic identifiers. Each taxonomic identifier must start with an indication of the rank by one of the letters d (for domain), k (kingdom), p (phylum), c (class), o (order), f (family), g (genus), or s (species). The letter is followed by a colon (:) and the name of that rank. Commas and semicolons are not allowed in the name of the rank. Example: ">X80725_S000004313;tax=d:Bacteria,p:Proteobacteria,c:Gammaproteobacteria,o:Enterobacteriales,f:Enterobacteriaceae,g:Escherichia/Shigella,s:Escherichia_coli".
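
As a rough illustration of the header reformatting, here is a minimal Python sketch that builds a sintax-style header from a sequence ID and a mapping of rank letters to names. The function name `sintax_header`, the dict-based input, and the choice to replace spaces, commas and semicolons with underscores are all assumptions made for this example; they are not taken from ampliseq or vsearch, and each database would still need its own parser to produce the mapping.

```python
import re

# Rank letters in the order given in the vsearch manual excerpt above.
RANK_LETTERS = ("d", "k", "p", "c", "o", "f", "g", "s")


def sintax_header(seq_id, taxa):
    """Build a fasta header line with a ';tax=' annotation.

    seq_id: sequence identifier without the leading '>'.
    taxa:   dict mapping a rank letter (d, k, p, c, o, f, g, s) to a name.
            Ranks that are missing or empty are skipped.
    """
    unknown = set(taxa) - set(RANK_LETTERS)
    if unknown:
        raise ValueError(f"unsupported rank letters: {sorted(unknown)}")

    parts = []
    for letter in RANK_LETTERS:
        name = taxa.get(letter)
        if not name:
            continue
        # Commas and semicolons are not allowed in rank names. Replacing them
        # (and whitespace) with '_' is an assumption of this sketch.
        clean = re.sub(r"[,;\s]+", "_", name.strip())
        parts.append(f"{letter}:{clean}")

    return f">{seq_id};tax={','.join(parts)}"


if __name__ == "__main__":
    print(sintax_header(
        "X80725_S000004313",
        {"d": "Bacteria", "g": "Escherichia/Shigella", "s": "Escherichia coli"},
    ))
```

With the taxonomy from the manual's example as input, this should reproduce the example header; how each supported database's native header maps onto the rank letters is the part that would need database-specific handling.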

@jtangrot added the enhancement label Aug 17, 2023
@erikrikarddaniel
Member

Why not?! 👍
