reference database for the function AssignTaxonomy() #1925

Closed
madjus98 opened this issue Apr 10, 2024 · 6 comments

@madjus98

Dear Dr Callahan,
I'm working with DADA2 for food authentication and microbiome analysis, and I am seeking guidance on finding a suitable FASTA database for the assignTaxonomy() function. While I understand NCBI no longer provides pre-built FASTA databases, my experience with format conversion is limited. Any recommendations for finding a good database would be appreciated. In addition, the SILVA databases may be too limited for my purposes, and I was not able to find a new, adequate version on the SILVA website.

I really appreciate your help!

@benjjneb
Owner

A set of pre-built taxonomic reference databases for assignTaxonomy() are available here: https://benjjneb.github.io/dada2/training.html

If those meet your needs, then great. There is also a section on that page about how to format custom databases should that become a necessity.

@madjus98
Author

Thank you so much for your answer. I was also wondering whether it is possible to convert the databases available on NCBI or SILVA into a format usable by assignTaxonomy(). Do you have any tutorial or guidelines for that?

Thank you so much

@benjjneb
Owner

We have pre-formatted versions of Silva at the page linked above.

> Do you have any tutorial or guidelines for that?

Nothing that rises to the level of a tutorial. The https://benjjneb.github.io/dada2/training.html#formatting-custom-databases section describes the required format for assignTaxonomy (which is pretty simple). There is also code in the dada2 R package to do this for Silva and RDP, but that code is more involved than will often be necessary because we do some QC on the underlying databases at the same time.

RDP code: https://github.com/benjjneb/dada2/blob/master/R/taxonomy.R#L382
Silva code: https://github.com/benjjneb/dada2/blob/master/R/taxonomy.R#L501
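
As a rough illustration of the format described on that page, the sketch below (plain Python, not part of dada2) rewrites FASTA headers into the semicolon-delimited `>Level1;Level2;...;` form, with the sequence on the following line. The function names, the example sequences, and the ID-to-lineage dictionary are all assumptions made for illustration. In practice the lineages would come from the taxonomy file shipped with the source database, and the training page remains the authority on the exact format.

```python
from typing import Dict, Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA text; header excludes '>'."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def to_taxonomy_fasta(fasta_text: str, lineages: Dict[str, List[str]]) -> str:
    """Rewrite headers as 'Level1;Level2;...;' using a sequence-ID -> lineage map.

    Records without a known lineage are skipped rather than guessed.
    """
    out = []
    for header, seq in parse_fasta(fasta_text):
        parts = header.split()
        if not parts:
            continue
        ranks = lineages.get(parts[0])  # ID = first whitespace-delimited token
        if not ranks:
            continue
        out.append(">" + ";".join(ranks) + ";")  # trailing semicolon included
        out.append(seq.upper())
    return "".join(line + "\n" for line in out)


# Hypothetical input: three records, only two of which have a lineage.
example_fasta = """>seq1 some description
ACGTACGT
ACGT
>seq2
ttgaccga
>seq3
GGGG
"""
example_lineages = {
    "seq1": ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
             "Lactobacillaceae", "Lactobacillus"],
    "seq2": ["Bacteria", "Proteobacteria"],  # not assigned below phylum
}

formatted = to_taxonomy_fasta(example_fasta, example_lineages)
print(formatted, end="")
```

The resulting text can be written to a file, and that file's path passed as the `refFasta` argument of assignTaxonomy(). Skipping records with no lineage is a design choice of this sketch, not something dada2 requires.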

@madjus98
Author

Thank you so much for your reply. I believed that the databases offered here: https://benjjneb.github.io/dada2/training.html#formatting-custom-databases were only for training. Do you think they are also reliable enough for publishing paper results? I don't know whether they are only a part of the complete Silva and RDP databases. If you think they are fine for publications, my work is done and your suggestion was really helpful.

Finally, regarding the code that you sent for producing them: it's quite difficult, and I would really like to learn how to manage it. Do you have any advice, for example on which files I need to download from the SILVA download page to produce an updated version usable with DADA2? Thank you so much!

@benjjneb
Owner

> Do you think that they could be reliable also for publishing paper results?

Yes, the officially supported references are suitable for publishing paper results.

@madjus98
Author

Okay thanks!
