Here is the head of the file stats_fastp.json for a random single-end Illumina sequencing sample, after running it through fastp with the following command:

We can see that after_filtering there are 10'933'431 reads left in the cleaned FASTQ. However, the filtering_result category tells us that as many as 18'724'357 reads passed the filter. This is a huge mismatch. What happened to the roughly 8 million missing reads? Why did they get removed?
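A quick way to see the mismatch programmatically is to compare the two counters directly. Here is a minimal sketch; the key paths (summary.after_filtering.total_reads and filtering_result.passed_filter_reads) are assumptions based on fastp's usual JSON report layout, so verify them against the report your fastp version produces:

```python
import json

# Load the fastp JSON report named in the issue.
# Key paths are assumptions based on fastp's usual JSON layout;
# check them against the report produced by your fastp version.
with open("stats_fastp.json") as fh:
    report = json.load(fh)

after_filtering = report["summary"]["after_filtering"]["total_reads"]
passed_filter = report["filtering_result"]["passed_filter_reads"]

print(f"after_filtering total_reads: {after_filtering:,}")
print(f"passed_filter_reads:         {passed_filter:,}")
print(f"unaccounted-for reads:       {passed_filter - after_filtering:,}")
```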
I have the same issue here. It happened after I included the flags to filter out duplicated reads and low-complexity reads. Without those two flags, the numbers seemed to match each other.
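For context, an invocation along these lines would exercise the two flags mentioned above. This is a hypothetical sketch, not the commenter's actual command, and all file names are placeholders:

```bash
# Hypothetical paired-end invocation; file names are placeholders.
# --dedup drops duplicated reads; --low_complexity_filter drops
# low-complexity reads (the two flags referred to above).
fastp \
  -i sample_R1.fastq.gz -I sample_R2.fastq.gz \
  -o clean_R1.fastq.gz -O clean_R2.fastq.gz \
  --dedup --low_complexity_filter \
  --json stats_fastp.json --html stats_fastp.html
```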
Read1 after filtering:
total reads: 9899483
total bases: 969140514
Q20 bases: 930274034(95.9896%)
Q30 bases: 849902013(87.6965%)
Read2 after filtering:
total reads: 9899483
total bases: 968730404
Q20 bases: 922809589(95.2597%)
Q30 bases: 846947592(87.4286%)
Filtering result:
reads passed filter: 19798966
reads failed due to low quality: 3232674
reads failed due to too many N: 206
reads failed due to too short: 111888936
reads with adapter trimmed: 58014749
bases trimmed due to adapters: 1885062968
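Worth noting: in this particular paste the counters actually reconcile once read1 and read2 are summed, which suggests the passed-filter counter covers both mates of each pair. A quick check with the numbers copied from the report above:

```python
# Numbers copied verbatim from the report above.
read1_after = 9_899_483     # Read1 after filtering: total reads
read2_after = 9_899_483     # Read2 after filtering: total reads
passed_filter = 19_798_966  # reads passed filter

# For paired-end data, passed_filter appears to count both mates,
# so the sum of read1 and read2 should equal it.
assert read1_after + read2_after == passed_filter
print(read1_after + read2_after)  # 19798966
```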
I am also seeing a discrepancy in those results. I do use the --dedup parameter when I run fastp, but if duplicates are being removed, the final results should arguably reflect that.
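If dedup removals are the missing piece, the gap between the two counters should roughly equal the number of duplicate reads dropped. A small sketch under that assumption; the duplication.rate key name is taken from fastp's JSON report and should be verified against your version:

```python
import json

with open("stats_fastp.json") as fh:
    report = json.load(fh)

passed = report["filtering_result"]["passed_filter_reads"]
kept = report["summary"]["after_filtering"]["total_reads"]

# Assumption: --dedup removals are not itemized in filtering_result,
# so the gap should roughly equal the duplicates dropped.
dropped_as_duplicates = passed - kept
print(f"reads apparently removed by dedup: {dropped_as_duplicates:,}")

# fastp also reports an estimated duplication rate, which can be
# cross-checked against that gap.
dup_rate = report["duplication"]["rate"]
print(f"reported duplication rate: {dup_rate:.2%}")
```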