Losing reads after merging #1928
Comments
What are the read lengths you are generating? 2x250? And how were the reads pre-processed prior to dada2?
Hey there! Thanks for the quick response. Yes, sequencing generated 250 bp paired-end raw reads, and they were pre-processed by removing barcodes and primers.
Assuming that you are using the "Illumina" V3V4 protocol, the sequenced amplicons are ~440-460 nts long (there is a bimodal length distribution in the V3V4 region) and include the primers at the start of the reads, but do not have barcodes on the R1/R2 reads. Since you are using 2x250, that is only 500 nts of total sequencing for each read pair, thus a very short overlap region. What you are seeing in your data is probably the loss of the longer V3V4 mode from a failure to merge due to insufficient overlap. To simplify things, I would suggest dropping your pre-processing step and relying on …
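The overlap arithmetic above can be sketched in a few lines. This is a back-of-the-envelope check, not dada2 code: the 2x250 read lengths and the ~440/~460 nt amplicon modes come from this thread, and 12 is dada2's default `minOverlap` in `mergePairs`; the truncation lengths below are illustrative.

```python
def expected_overlap(read_len_f, read_len_r, amplicon_len):
    """Nucleotides of overlap between a forward/reverse read pair
    spanning an amplicon of the given length."""
    return read_len_f + read_len_r - amplicon_len

MIN_OVERLAP = 12  # dada2 mergePairs default

# 2x250 sequencing of the two V3V4 length modes:
print(expected_overlap(250, 250, 440))  # 60 nt of overlap
print(expected_overlap(250, 250, 460))  # 40 nt of overlap

# Any truncation in filterAndTrim comes straight out of the overlap.
# Truncating each read to an illustrative 235 nt drops the longer
# mode below minOverlap while the shorter mode still merges:
print(expected_overlap(235, 235, 460))  # 10 nt: merge fails
print(expected_overlap(235, 235, 440))  # 30 nt: still merges
```

Losing only the longer of the two modes is consistent with losing roughly half of the reads at the merge step.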
This did not work, unfortunately...
input filtered denoisedF denoisedR merged nonchim
G12CB2.1 203435 195575 179002 183177 99666 75499
I would check two things then: (1) What exactly is your amplicon design? i.e. which primer set, are primers at the start of the reads, and are there any other additional technical bases (e.g. heterogeneity spacers, barcodes) at the start of the reads? (2) Could there be substantial off-target amplification? You can check this straightforwardly by processing just the forward reads for a sample or two. What shows up there that is not in the merged reads?
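A quick way to compare the two runs is to tally read-length distributions: if the ~460 nt V3V4 mode is present in the forward-only output but absent from the merged reads, the longer amplicons are failing to merge rather than being off-target. A minimal sketch (the file name is hypothetical; dada2 itself is R, this is just a plain-Python FASTQ length counter):

```python
from collections import Counter

def fastq_lengths(lines):
    """Tally sequence lengths from FASTQ records (4 lines per record)."""
    counts = Counter()
    for i, line in enumerate(lines):
        if i % 4 == 1:  # second line of each record is the sequence
            counts[len(line.strip())] += 1
    return counts

# Hypothetical file of merged reads exported from dada2:
# with open("G12CB2.1_merged.fastq") as fh:
#     for length, n in sorted(fastq_lengths(fh).items()):
#         print(length, n)
```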
Hello!
I want to use dada2 for creating an ASV table of my 16S V3-V4 sequencing data. Sequencing was performed with Illumina PE250.
The quality profiles of my sequences look (suspiciously) good so I truncated my reads only by a few nts at the ends.
However, during merging I always lose about half of my reads. And by always I mean: I tried truncating shorter and longer, I adjusted maxEE, maxMismatch, and minOverlap. I also tried not truncating at all. Still, about half of my reads are always lost at merging.
input filtered denoisedF denoisedR merged nonchim
G12CB3.1 202420 193452 191447 191555 119718 114827
G12CP2.2 204188 194160 193195 193019 80558 77627
G12CP4.3 215010 203014 202915 202909 80327 80325
G12NP4.1 206177 196313 195784 195741 104359 101796
G12NP4.3 202944 193534 193214 193114 93879 91266
G15CB3.1 204004 195434 195178 195167 157006 156426
I suspect that in general my reads don't have much overlap, because I looked at some as an example and the overlap was between 8 and 40 nts...
Can someone maybe offer some assistance? What can I do?
QualityScoreForward.pdf
QualityScoreReverse.pdf