fastp not removing all Illumina universal adapter sequences as indicated by FastQC #558

Open
luckyvivi opened this issue Apr 15, 2024 · 2 comments

@luckyvivi

Hi, I recently ran fastp on an Illumina dataset with the following command:
fastp -i SRR18278237.fastq.gz -o SRR18278237.fastp.gz -z 9 -l 15 -w 16 --dedup --dup_calc_accuracy 6 -x -3 --cut_mean_quality 20 -j SRR18278237.fastp.json -h SRR18278237.fastp.html

I expected this command to remove the Illumina universal adapter sequences from the reads. However, after running FastQC on the output file, I still see significant adapter content in the FastQC report, specifically towards the end of the reads (please see the attached screenshot).
[Screenshot: FastQC adapter content plot]

Could you please help me understand the following:

  1. Is there a possibility that fastp might not remove some of the adapter sequences under certain conditions?
  2. Do I need to specify the adapter sequences explicitly using the -a option, even though these are standard Illumina universal adapters?
  3. Is there anything in my fastp command that might have prevented the adapter sequences from being adequately detected and trimmed?

I have attached the JSON and HTML reports from fastp for your reference. I would greatly appreciate any insights or suggestions you might have to resolve this issue.

Thank you for your assistance and for developing such a useful tool.

Best regards,
Xiaowen
Uploading SRR18278237 (1).fastp.zip…

@nreid

nreid commented May 14, 2024

I have a similar issue, but with Nextera adapters. fastp reports no adapter contamination, while FastQC reports Nextera adapter content of up to 10% by the read end. Even when I supply the Nextera FASTA file (the one provided with Trimmomatic), virtually no trimming happens.

By comparison, Trimmomatic with ILLUMINACLIP:"${ADAPTERS}":2:30:10 SLIDINGWINDOW:4:25 MINLEN:45 drops 7.25% of all reads.

This isn't a perfect comparison: I think fastp's default minimum window quality is 20, not 25. Still, something seems off here. I'm using v0.23.2.
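
One way to cross-check the disagreement between fastp and FastQC independently of either tool is to count, directly in the trimmed FASTQ, how many reads still contain the adapter prefix and where it starts. The following is a minimal Python sketch, not part of fastp or FastQC. It assumes plain 4-line FASTQ records and uses exact substring matching, so it ignores adapters with mismatches and partial adapters shorter than the search string. The two sequences are the ones I believe FastQC's default adapter list uses for "Illumina Universal Adapter" and "Nextera Transposase Sequence"; verify them against your FastQC installation. The function names `adapter_positions` and `scan_file` are made up for this illustration.

```python
import gzip
from collections import Counter

# Assumption: these are the search sequences in FastQC's default adapter list.
# Check the adapter list shipped with your FastQC version before relying on them.
ADAPTERS = {
    "Illumina Universal Adapter": "AGATCGGAAGAGC",
    "Nextera Transposase Sequence": "CTGTCTCTTATA",
}


def adapter_positions(fastq_lines, adapter):
    """Count reads containing `adapter` (exact match) and where it first starts.

    Assumes 4-line FASTQ records: the sequence is the 2nd line of each record.
    Returns (total_reads, reads_with_adapter, Counter of 0-based start positions).
    """
    total = 0
    hits = 0
    positions = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 != 1:  # skip header, '+' separator, and quality lines
            continue
        total += 1
        pos = line.strip().upper().find(adapter)
        if pos != -1:
            hits += 1
            positions[pos] += 1
    return total, hits, positions


def scan_file(path, adapter):
    """Run adapter_positions over a gzip-compressed FASTQ file."""
    with gzip.open(path, "rt") as handle:
        return adapter_positions(handle, adapter)


if __name__ == "__main__":
    # Tiny in-memory demo: the first read carries the adapter, the second does not.
    demo = [
        "@r1", "ACGTACGTAGATCGGAAGAGCACAC", "+", "I" * 25,
        "@r2", "ACGTACGTACGTACGTACGTACGTA", "+", "I" * 25,
    ]
    total, hits, positions = adapter_positions(
        demo, ADAPTERS["Illumina Universal Adapter"]
    )
    print(f"{hits}/{total} reads contain the adapter; start positions: {dict(positions)}")
```

For a real file, something like `scan_file("SRR18278237.fastp.gz", ADAPTERS["Illumina Universal Adapter"])` would give the counts. If a noticeable fraction of reads still contain the sequence, with start positions clustered near the read end, that would support the FastQC plot over fastp's own report.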
