Add demultiplexing step #64

Open
DiegoBrambilla opened this issue Jan 18, 2019 · 12 comments
Labels
enhancement New feature or request

Comments

@DiegoBrambilla
Contributor

Hi,
A very helpful feature to add would be optional demultiplexing of the reads. This function has already been developed in QIIME2, so it should be possible to add it to the rrna-ampliseq pipeline.

@d4straub d4straub added the enhancement New feature or request label Jan 18, 2019
@d4straub d4straub self-assigned this Jan 18, 2019
@d4straub
Collaborator

This might be a helpful feature. As far as I know, work is ongoing to wrap DADA2 directly in this pipeline instead of running DADA2 through QIIME2. I am therefore unsure how to integrate this feature sustainably, given the major changes planned for the early workflow. However, PRs are welcome.

@d4straub d4straub removed their assignment Jul 13, 2019
@d4straub
Collaborator

@DiegoBrambilla is planning to implement dada2 for PacBio analysis and could immediately add that demultiplexing step :)

@DiegoBrambilla
Contributor Author

We will take it into consideration.
For the time being, implementing the R-DADA2 pipeline, taxonomy annotation from several sources, and dealing with PacBio reads take priority.

@d4straub
Collaborator

d4straub commented Jan 3, 2022

Demultiplexing could be done via cutadapt as documented here. I have never come across the need for demultiplexing in the pipeline, but if anyone does, please mention it here and I might look into it further.

@a4000
Contributor

a4000 commented Aug 2, 2023

I want to add demultiplexing (with Cutadapt) to Ampliseq. In my own nf-core style pipeline, I ask the user to specify the path to their raw data on the command line: --raw_data "/path/to/data/*{R1,R2}*.fastq.gz". In the sample sheet, the user then has to add the columns fw_index, rv_index, fw_primer, and rv_primer (the two rv_ columns can be empty for single-end data). I use the _index columns for demultiplexing and the _primer columns for trimming after demultiplexing. The main issue I see is that Ampliseq doesn't require a sample sheet as input, so does anyone have a suggestion for a better way of adding this feature to Ampliseq? Maybe the sample sheet should be required if the user wants to demultiplex?
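
For illustration only, the sample sheet layout described in this comment could be read roughly as follows. This is a minimal sketch and not code from Ampliseq or from the commenter's pipeline: the `sampleID` column name, the tab-separated format, and the helper names `read_sample_sheet` and `index_fasta` are all assumptions. It also assumes each sample has its own index sequence, which would not hold for every indexing scheme.

```python
import csv
import io

# Column names follow the comment above; "sampleID" is an assumed ID column.
REQUIRED_COLUMNS = ("sampleID", "fw_index", "fw_primer")
OPTIONAL_COLUMNS = ("rv_index", "rv_primer")  # may be empty for single-end data


def read_sample_sheet(handle):
    """Parse a tab-separated sample sheet into a list of dicts (hypothetical helper)."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"sample sheet is missing columns: {', '.join(missing)}")
    rows = []
    for row in reader:
        # Absent or empty optional fields are normalised to an empty string.
        rows.append({col: (row.get(col) or "").strip()
                     for col in REQUIRED_COLUMNS + OPTIONAL_COLUMNS})
    return rows


def index_fasta(rows, column):
    """Render one FASTA record per sample that has a non-empty index in `column`."""
    return "".join(f">{row['sampleID']}\n{row[column]}\n"
                   for row in rows if row[column])


if __name__ == "__main__":
    # Illustrative values only.
    sheet = io.StringIO(
        "sampleID\tfw_index\trv_index\tfw_primer\trv_primer\n"
        "s1\tACGTAC\tTTGACA\tGTGYCAGCMGCCGCGGTAA\tGGACTACNVGGGTWTCTAAT\n"
    )
    print(index_fasta(read_sample_sheet(sheet), "fw_index"))
```

The FASTA text produced from the `_index` columns is the kind of barcode file a demultiplexing tool would take as input, while the `_primer` columns would be used afterwards for trimming.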

@d4straub
Collaborator

d4straub commented Aug 2, 2023

What about adding a few optional columns (such as fw_index, rv_index) to the sample sheet? If those columns are present, demultiplexing will run. If that messes too much with existing routines, a separate input file (e.g. --demultiplex "sheet.tsv") that contains the necessary information (with the samplesheet and demultiplex sheet sharing identical IDs) might be an option.
While ampliseq does not require a samplesheet (folder input & fasta input are also allowed), for demultiplexing that would be fine. After all, a samplesheet can handle more info than a folder input. Not all input options need to support all functionality, imho.
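
The two options in this comment can be sketched as simple checks. This is only an illustration under assumptions: neither function exists in Ampliseq, the names `demultiplexing_requested` and `check_matching_ids` are hypothetical, and treating `fw_index` as the column that triggers demultiplexing is one possible reading of the proposal.

```python
def demultiplexing_requested(columns):
    """Option 1: run demultiplexing only if the optional index column is present."""
    return "fw_index" in columns


def check_matching_ids(sample_ids, demux_ids):
    """Option 2: a separate demultiplex sheet must list the same IDs as the samplesheet."""
    only_in_samples = sorted(set(sample_ids) - set(demux_ids))
    only_in_demux = sorted(set(demux_ids) - set(sample_ids))
    if only_in_samples or only_in_demux:
        raise ValueError(
            f"ID mismatch: only in samplesheet {only_in_samples}, "
            f"only in demultiplex sheet {only_in_demux}"
        )


if __name__ == "__main__":
    print(demultiplexing_requested(["sampleID", "forwardReads", "fw_index"]))
    check_matching_ids(["s1", "s2"], ["s2", "s1"])  # passes silently
```

With the first option, existing sample sheets without index columns would behave exactly as before, which is why it interferes least with current inputs.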

@erikrikarddaniel
Member

To me, adding columns to the sample sheet sounds best.

@NoMeatNo

Hi there,

I’m curious if it’s now possible to utilize AmpliSeq with the combinatorial dual indexing system and perform demultiplexing directly in the pipeline as part of the AmpliSeq workflow. Could someone please clarify? Thanks!

@a4000
Contributor

a4000 commented Apr 29, 2024

@NoMeatNo unfortunately no. That's not a part of Ampliseq yet.

@NoMeatNo

> @NoMeatNo unfortunately no. That's not a part of Ampliseq yet.

Oh, I see. Thanks @a4000 for the quick response.

In the meantime, what’s the best strategy to follow? Would using Cutadapt and then Ampliseq be effective? How about q2-demux?

Earlier, you mentioned a method for demultiplexing in your own nf-core style pipeline, which involved specifying the path to raw data and using specific columns in the sample sheet. Could you provide more details on how you managed it? I’d appreciate any additional information you can share.

@a4000
Contributor

a4000 commented Apr 30, 2024

@NoMeatNo I haven't tried q2-demux, but I do recommend following Cutadapt's documentation on demultiplexing here. Using Cutadapt then Ampliseq should be effective.
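
As a rough sketch of what the Cutadapt step could look like before running Ampliseq, the helper below only assembles a command line and does not run anything. The function name `cutadapt_demux_command` and all file names are hypothetical. The flags follow the demultiplexing section of the Cutadapt user guide as best understood here (anchored barcodes via `^file:`, `{name}` or `{name1}-{name2}` output placeholders), so verify them against the documentation for your Cutadapt version.

```python
import shlex


def cutadapt_demux_command(fwd_barcodes, reads_r1, rev_barcodes=None,
                           reads_r2=None, error_rate=0.15):
    """Build (but do not run) a Cutadapt demultiplexing command as an argument list."""
    if (rev_barcodes is None) != (reads_r2 is None):
        raise ValueError("give both rev_barcodes and reads_r2, or neither")
    cmd = ["cutadapt", "-e", str(error_rate), "--no-indels",
           "-g", f"^file:{fwd_barcodes}"]
    if reads_r2 is not None:
        # Combinatorial dual indexing: one barcode file per read direction.
        cmd += ["-G", f"^file:{rev_barcodes}",
                "-o", "{name1}-{name2}.1.fastq.gz",
                "-p", "{name1}-{name2}.2.fastq.gz",
                reads_r1, reads_r2]
    else:
        # Single-end data: one output file per barcode name.
        cmd += ["-o", "{name}.fastq.gz", reads_r1]
    return cmd


if __name__ == "__main__":
    cmd = cutadapt_demux_command("barcodes_fwd.fasta", "input.1.fastq.gz",
                                 "barcodes_rev.fasta", "input.2.fastq.gz")
    print(shlex.join(cmd))  # shell-quoted form for copy and paste
```

The command only splits reads by index. Primer trimming and quality filtering are deliberately left out, so that those steps remain with Ampliseq.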

@d4straub
Collaborator

d4straub commented Apr 30, 2024

You could also try https://nf-co.re/demultiplex (which I have never used) first and then run ampliseq. If you do, let us know whether it works as expected. Just don't do primer trimming or any quality filtering!


5 participants