Some trouble with the FastQC report #667

Open
Citrusyh opened this issue Apr 29, 2024 · 3 comments

@Citrusyh

Hi Felix,
Sorry to trouble you. I’m having some trouble interpreting my FastQC report and would like to ask you a few questions.

  1. According to “Per base sequence content”, should I clip the first 6 bp to get good results? The curve also doesn’t look smooth.
  2. In “Sequence Duplication Levels”, why is there only one line? Is something wrong with my commands?
  3. There are a lot of overrepresented sequences. Is that normal for RRBS data?

I ran the following commands:

  1. trim_galore
    trim_galore -q 20 --phred33 --stringency 3 --length 20 -e 0.1 --paired A61.1.fq.gz A61.2.fq.gz -o /export/home/***
  2. fastqc
    fastqc -o /export/home/limiao29/RRBS/Lung/fastqc -t 12 /export/home/limiao29/RRBS/Lung/*.fq.gz

Screenshot 2024-04-29 222253
Screenshot 2024-04-29 222310
Screenshot 2024-04-29 222333
Image 1
Image 2

@FelixKrueger
Owner

RRBS data is weird, as by definition you are only sequencing a very small subset of the genome (hence: reduced representation). Depending on the specific protocol and genome there are only a few hundred thousand possible fragments you expect to sequence, and you've got > 30 million reads. So naturally, you will sequence the same fragments several times, and evidently some of them are highly over-represented.
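The arithmetic behind that argument can be sketched roughly as follows. The fragment count of 300,000 is an assumed stand-in for "a few hundred thousand", the read count is the 30 million lower bound mentioned above, and the model (every read drawn uniformly at random from the possible fragments) is a deliberate simplification, so the numbers are only indicative and are not what FastQC itself computes.

```python
def expected_distinct(fragments: int, reads: int) -> float:
    """Expected number of distinct fragments observed when `reads` reads are
    drawn uniformly at random (with replacement) from `fragments` fragments."""
    return fragments * (1.0 - (1.0 - 1.0 / fragments) ** reads)


# Assumption: 300,000 possible fragments ("a few hundred thousand").
fragments = 300_000
# Lower bound taken from the thread: more than 30 million reads.
reads = 30_000_000

# Average number of reads landing on each possible fragment.
mean_depth = reads / fragments

# Share of reads that repeat a fragment already seen, under the uniform model.
distinct = expected_distinct(fragments, reads)
duplicate_fraction = 1.0 - distinct / reads

print(f"mean reads per fragment: {mean_depth:.0f}")
print(f"expected duplicate fraction: {duplicate_fraction:.2%}")
```

Under these assumptions each fragment is hit about 100 times on average, so almost every read repeats a fragment that was already sequenced. Real libraries are not sampled uniformly, which is one reason some sequences end up far more over-represented than others.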

There isn't really much you can do about the duplication (except perhaps deduplicating using UMIs); it just comes with the method. The same goes for the base composition: it is expected. The only thing that needs (hard-)trimming is the filled-in bases from the end-repair reaction. Is this by any chance the Diagenode v2 kit?
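For readers unfamiliar with the term, hard-trimming means removing a fixed number of bases from one end of every read, regardless of quality or adapter content. The sketch below only illustrates the idea: the function name `hard_trim`, its parameters, and the example read are hypothetical, and the number of bases and the end to clip depend on the library kit, which this thread does not settle. In practice the trimming tool's own clipping options would do this, not hand-written code.

```python
from typing import Tuple


def hard_trim(seq: str, qual: str,
              clip_5prime: int = 0, clip_3prime: int = 0) -> Tuple[str, str]:
    """Remove a fixed number of bases from the 5' and/or 3' end of a read,
    trimming the quality string in step with the sequence."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings must have equal length")
    if clip_5prime < 0 or clip_3prime < 0:
        raise ValueError("clip lengths must be non-negative")
    # Never let the end index fall before the start index, so that
    # over-clipping yields an empty read rather than a wrong slice.
    end = max(len(seq) - clip_3prime, clip_5prime)
    return seq[clip_5prime:end], qual[clip_5prime:end]


# Hypothetical read; clipping 2 bases from the 5' end is an illustrative
# value only, not a recommendation for any particular kit.
read = "CGGTATCGAT"
quality = "IIIIIIIIII"
trimmed_seq, trimmed_qual = hard_trim(read, quality, clip_5prime=2)
print(trimmed_seq, trimmed_qual)
```

Keeping the quality string aligned with the sequence matters because FASTQ records are invalid if the two lengths differ.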

@Citrusyh
Author

I am sorry to say that I know little about this, because I paid a company to run the experiment. I will ask the company for more details. Thank you for your kind reply!

@FelixKrueger
Owner

If it happens to be the Diagenode v2 RRBS kit, there was recently a discussion as well as some processing tips here: FelixKrueger/TrimGalore#177 (comment)
