A feature to remove taxa found in controls #634

Open
skose82 opened this issue Sep 5, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@skose82

skose82 commented Sep 5, 2023

Description of feature

Hi there,

As per discussion with Daniel Straub, I'd like to request a contamination removal feature in which taxa found in controls are removed from the main sample set. It would be great if the feature were optional, as sometimes the controls (water, etc.) contain cross-contamination from the sample set rather than from the environment itself.

skose82 added the enhancement (New feature or request) label Sep 5, 2023
@d4straub
Collaborator

d4straub commented Sep 6, 2023

Thanks!
The idea here could be to add a parameter, e.g. --contamination_controls "sample1,sample2", so that all sequences that appear in those control samples are removed from the ASV table (including the control samples themselves).
A more advanced option for such a task (using control samples) might be decontam, which is also available in Bioconda.
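
The simple filter proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `remove_control_asvs` and the dict-of-dicts table layout are hypothetical, `--contamination_controls` is the parameter suggested in this comment rather than an existing ampliseq option, and "appears in a control" is taken to mean any nonzero read count.

```python
# Hypothetical sketch: names and data layout are illustrative, not ampliseq's.
def remove_control_asvs(asv_table, control_samples):
    """Drop every ASV seen in any control sample, and drop the controls too.

    asv_table: dict mapping ASV id -> dict mapping sample name -> read count.
    control_samples: iterable of sample names, e.g. parsed from a
                     comma-separated string such as "sample1,sample2".
    """
    controls = set(control_samples)
    filtered = {}
    for asv, counts in asv_table.items():
        # Assumption: any nonzero count in a control flags the ASV.
        if any(counts.get(sample, 0) > 0 for sample in controls):
            continue
        # Keep the ASV, but without the control sample columns.
        filtered[asv] = {s: n for s, n in counts.items() if s not in controls}
    return filtered


# Made-up toy table: two real samples and one water control.
asv_table = {
    "ASV1": {"gut1": 120, "gut2": 80, "water": 0},
    "ASV2": {"gut1": 5, "gut2": 0, "water": 40},
    "ASV3": {"gut1": 0, "gut2": 33, "water": 0},
}
controls = "water".split(",")
cleaned = remove_control_asvs(asv_table, controls)
print(cleaned)
```

In this toy table ASV2 is dropped from every sample because it occurs in the water control, even though it also has reads in gut1. The filter cannot tell a reagent contaminant from a sample ASV that leaked into the control.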

@erikrikarddaniel
Member

I would absolutely recommend Decontam. We have seen in actual projects that raw removal of ASVs found in negative controls risks both removing true ASVs found in samples and missing contaminants. This is, of course, taking Decontam as the truth, but the results have looked intuitively good.

There are at least two ways of running Decontam, and I think it would be wise to allow both.

@d4straub
Collaborator

Alright, thanks. Then it will not be worth the effort to implement the simple method above; it would be better to implement a proper one such as Decontam right away.

@skose82
Author

skose82 commented Sep 12, 2023

Hi all,

I wouldn't advise decontam until everything is known about exactly how it removes an ASV. We still need a clean feature that simply removes anything found in the control samples as a first pass, for comparison with a second pass without removal. This is what we did before ampliseq and what most microbiologists do with every project: scan the controls and remove what they see as legitimate contamination. Doing this removal by hand is time-intensive and tedious, and then you have to replot. It would be truly worthwhile to have this feature as an option; then we can look at the output and decide whether it's worth using decontam instead. It certainly should be an option, as it currently is not an option in decontam!
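
One way to make the "with removal" versus "without removal" comparison described above less tedious is to report, per sample, how many reads the control-based filter would discard. The following is a minimal sketch under assumptions: `removal_summary` and the dict-of-dicts table layout are hypothetical names chosen for illustration, and an ASV is flagged whenever it has a nonzero count in any control.

```python
# Hypothetical sketch: reports what a control-based filter would remove,
# without modifying the table itself.
def removal_summary(asv_table, control_samples):
    """Return (flagged ASV ids, per-sample removal statistics).

    asv_table: dict mapping ASV id -> dict mapping sample name -> read count.
    control_samples: iterable of control sample names.
    """
    controls = set(control_samples)
    # ASVs with any reads in any control sample.
    flagged = {
        asv for asv, counts in asv_table.items()
        if any(counts.get(s, 0) > 0 for s in controls)
    }
    # All non-control samples present in the table.
    samples = sorted(
        {s for counts in asv_table.values() for s in counts} - controls
    )
    summary = {}
    for sample in samples:
        total = sum(counts.get(sample, 0) for counts in asv_table.values())
        removed = sum(asv_table[asv].get(sample, 0) for asv in flagged)
        summary[sample] = {
            "reads_total": total,
            "reads_removed": removed,
            "fraction_removed": removed / total if total else 0.0,
        }
    return flagged, summary


# Made-up toy table: two real samples and one water control.
asv_table = {
    "ASV1": {"gut1": 120, "gut2": 80, "water": 0},
    "ASV2": {"gut1": 5, "gut2": 0, "water": 40},
    "ASV3": {"gut1": 0, "gut2": 33, "water": 0},
}
flagged, summary = removal_summary(asv_table, ["water"])
for sample, stats in summary.items():
    print(sample, stats)
```

A summary like this only quantifies the effect of the simple filter; it does not say whether a flagged ASV is a true contaminant.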

@d4straub
Collaborator

Hm, that's a rather emotional plea for a simple method. I do think that the decontam documentation is not too ambiguous. Decontam implements a method that uses control samples, see here. I am not sure what your exact criticism is?
Manual manipulation is, however, the worst way of processing data in my opinion; it would in any case be better to automate, i.e. standardize and make reproducible. I can live with having optional filters available. If you or someone else wants to implement that simple method because you feel it's a method with a future, I will not stand in your way (I cannot speak for others though).

Projects
None yet
Development

No branches or pull requests

3 participants