Fecal samples not the same results and error msg when learning errors #1927

FlorianRocher · 2024-04-12T14:35:39Z

Hi,

I'm trying to apply the pipeline described here https://benjjneb.github.io/LRASManuscript/LRASms_Zymo.html using version 1.31 of dada2, but I'm not getting the same number of filtered reads. I also get the following error message when I use the learnErrors function:

The max qual score of 93 was not detected. Using standard error fitting.
Error rates could not be estimated (this is usually because of very few reads).
Error in getErrors(err, enforce = TRUE) : Error matrix is NULL.

Here are the results I get from the lines before:

prim <- removePrimers(MockZymo, nop, primer.fwd=F27, primer.rev=dada2:::rc(R1492), orient=TRUE, verbose=TRUE)
Multiple matches to the primer(s) in some sequences. Using the longest possible match.
39439 sequences out of 77453 are being reverse-complemented.
Overwriting file:C:\Users\fner0001\Documents-Local\PostDoc_Florian\Lab_Project_Data\Metabarcoding_Tutorials\PacbioFecal\Mock\noprimers\Zymo.fastq.gz
Read in 77453, output 73057 (94.3%) filtered sequences.

filt <- file.path(MockDir, "noprimers", "filt", basename(MockZymo))
track <- filterAndTrim(nop, filt, minQ=3, minLen=1000, maxLen=1600, maxN=0, rm.phix=FALSE, maxEE=2, verbose=TRUE)
Overwriting file:C:\Users\fner0001\Documents-Local\PostDoc_Florian\Lab_Project_Data\Metabarcoding_Tutorials\PacbioFecal\Mock\noprimers\filt\Zymo.fastq.gz
Read in 73057, output 72940 (99.8%) filtered sequences.
track <- fastqFilter(nop, filt, minQ=3, minLen=1000, maxLen=1600, maxN=0, rm.phix=FALSE, maxEE=2, verbose=TRUE)
Overwriting file:C:/Users/fner0001/Documents-Local/PostDoc_Florian/Lab_Project_Data/Metabarcoding_Tutorials/PacbioFecal/Mock/noprimers/filt/Zymo.fastq.gz
Read in 73057, output 72940 (99.8%) filtered sequences.

dereplicate

drp <- derepFastq(filt, verbose=TRUE)
Dereplicating sequence entries in Fastq file: C:/Users/fner0001/Documents-Local/PostDoc_Florian/Lab_Project_Data/Metabarcoding_Tutorials/PacbioFecal/Mock/noprimers/filt/Zymo.fastq.gz
Encountered 22309 unique sequences from 72940 total sequences read.

Best

Florian Rocher

benjjneb · 2024-04-12T14:56:17Z

Did you download the data using the SRA Toolkit, or via web links?

You need to use the SRA Toolkit to get valid data. The web links give the "SRAlite" data format, which is not the original data: it replaces all the real quality scores with Q30.

See here for more info: benjjneb/LRASManuscript#7
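
A quick way to confirm whether a downloaded file has had its quality scores flattened is to tabulate the scores it contains. The sketch below is only an illustration and not part of dada2 or the SRA Toolkit: the function names `quality_score_counts` and `looks_like_sralite` are made up for this example, and it assumes Phred+33 encoding and plain four-line FASTQ records (no wrapped sequence lines). If every base reports the same score (for example Q30), the file is probably SRAlite-style data, which would also be consistent with learnErrors reporting that the max qual score of 93 was not detected.

```python
import gzip
from collections import Counter


def quality_score_counts(fastq_path, max_reads=10000, phred_offset=33):
    """Count Phred scores in the first `max_reads` records of a FASTQ file.

    Assumes four lines per record and Phred+33 encoding (both assumptions).
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        for line_no, line in enumerate(handle):
            if line_no // 4 >= max_reads:
                break
            if line_no % 4 == 3:  # fourth line of each record is the quality string
                counts.update(ord(ch) - phred_offset for ch in line.rstrip("\r\n"))
    return counts


def looks_like_sralite(counts):
    """Return True when every base carries the same quality score."""
    return len(counts) == 1


# Example usage (the path is a placeholder for your own file):
# counts = quality_score_counts("Zymo.fastq.gz")
# print(sorted(counts.items()))
# print("Flattened quality scores:", looks_like_sralite(counts))
```

Only a sample of reads is inspected by default, which is usually enough: original PacBio CCS data should show a spread of scores up to 93, whereas flattened data shows a single value.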

FlorianRocher · 2024-04-12T14:59:07Z

Thanks a lot for your answer. Indeed, I downloaded the samples directly from the NCBI web page. I'll use the SRA Toolkit.

Best

Florian

