Cleaning by removing 'Dangling ends' fragments only? #164
Comments
Yes you can use Then use
|
Hi @yermek2030, I encountered a similar problem, so I was wondering whether you succeeded in removing the peak around 100 bp, and if so, with which settings for -i, -o and -d? I followed the instructions @kaukrise suggested above, but the peak is still present. (using Additionally, @kaukrise, do you perhaps have an explanation for why I also have a peak around 10 bp? I don't understand why this peak is present or how I can remove it. Thank you in advance for helping me! |
Dear Simone,
I tried what @kaukrise suggested and it didn't work for me either. That is
why I did not reply to @kaukrise here - I just pressed the
'like' button for now. I was planning to return to this issue when I had
more time. It would be great to hear how this issue can be solved.
|
Hi everyone, could you please post the complete commands you used to do the filtering? Otherwise it is difficult to guess where exactly the filtering might have failed. Thank you! @SOlsthoorn I don't know why you have a peak around 10 bp, either. Your restriction profile looks a bit unusual to me in general: the mean insert size is quite small, and there are no large fragments at all. But this could be related to your particular protocol. |
Hi @kaukrise, First I created a pairs file using: Then I made a ligation error plot and a restriction site distance plot: and based on that I filtered the pairs file: So it seems that the peak around 100 bp decreased a bit but did not disappear completely, while the peak around 10 bp only increased in density? Yes, the restriction profile does indeed appear unusual. I thought that the mean might shift a bit more towards 200 bp, and that the pattern would look more like a normal distribution, if the high peaks at the beginning were removed. Thanks for your help! |
Hi @kaukrise, Sorry to bother you again with my issue. But could you maybe provide a more detailed explanation of the restriction site distance plot, what it represents and how it's constructed? On the FAN-C website, there is mention of using the -d parameter to filter read pairs based on their cumulative distance to the nearest restriction site. It's suggested that generating this re-dist plot, using only 10,000 read pairs, is valuable for determining -d and identifying library issues. Could you elaborate on this process? Maybe if I understand this better it can help me figure out the origin of short range peaks and find a way to remove them. Thanks again in advance for your help! |
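For readers following along, the quantity described above (the cumulative distance of a read pair to the nearest restriction site) can be illustrated with a small sketch. This is not FAN-C's implementation: the function names, the single-chromosome simplification, the toy coordinates and the threshold of 50 are all assumptions made for illustration. It only shows the idea of summing, for both mates, the distance to the closest restriction site and keeping pairs whose sum is below a cutoff.

```python
from bisect import bisect_left

# Illustrative sketch only; NOT FAN-C's code. All names are hypothetical.
# Assumes both mates lie on the same chromosome and `sites` is sorted.

def nearest_site_distance(pos, sites):
    """Distance from `pos` to the closest position in the sorted list `sites`."""
    i = bisect_left(sites, pos)
    candidates = []
    if i < len(sites):
        candidates.append(sites[i] - pos)      # closest site at or right of pos
    if i > 0:
        candidates.append(pos - sites[i - 1])  # closest site left of pos
    return min(candidates)

def cumulative_re_distance(pos1, pos2, sites):
    """Sum of both mates' distances to their nearest restriction site."""
    return nearest_site_distance(pos1, sites) + nearest_site_distance(pos2, sites)

def filter_pairs(pairs, sites, max_dist):
    """Keep pairs whose cumulative distance does not exceed `max_dist`."""
    return [p for p in pairs
            if cumulative_re_distance(p[0], p[1], sites) <= max_dist]

# Toy data: three restriction sites and three read pairs (made-up coordinates).
sites = [100, 500, 900]
pairs = [(110, 480), (300, 700), (95, 905)]
kept = filter_pairs(pairs, sites, max_dist=50)
print(kept)
```

A histogram of `cumulative_re_distance` over a sample of pairs would be one way to picture what such a distance plot summarises; how FAN-C actually builds its plot should be checked against its documentation.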
Dear @kaukrise and @SimoneO98, Thank you for your patience.
|
Dear members of Vaquerizas lab,
I was wondering if there is a way to remove only the 'dangling ends' fragments from FASTQ or BAM files. The need arises because I aligned my reads to BAM files using standard BWA MEME and SAMTOOLS. When I now plot 'fragment size vs density' using FAN-C, I get a peak at around 100-110 bp (Figure 1), which disrupts the expected bell-like shape of the distribution that we get with data from public repositories. Given that all the adaptors were removed prior to generating the BAM files, I suspect that the peak comes from 'dangling ends' as classified in the QC plot output of FAN-C, because the bar for this type of contaminant is the only one that differs significantly from the (presumably 'clean') public datasets we use. If possible, I would like to test my hypothesis by finding and removing the 'dangling ends' fragments only.
Many thanks.
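One way to picture the hypothesis test described above is the sketch below. It is not FAN-C's classifier: it assumes a simple working proxy in which a pair is a candidate unligated fragment when both mates map to the same chromosome, point towards each other, and lie within a short distance. The tuple layout, function names, coordinates and the 500 bp threshold are hypothetical.

```python
# Illustrative sketch only; NOT FAN-C's code. All names and values are hypothetical.
# A pair is a tuple: (chrom1, pos1, strand1, chrom2, pos2, strand2).

def is_inward(pos1, strand1, pos2, strand2):
    """True if the mates point towards each other (left mate '+', right mate '-')."""
    if pos1 <= pos2:
        return strand1 == "+" and strand2 == "-"
    return strand2 == "+" and strand1 == "-"

def looks_unligated(pair, max_dist):
    """Proxy for a dangling-end-like pair: same chromosome, inward, close together."""
    chrom1, pos1, strand1, chrom2, pos2, strand2 = pair
    return (chrom1 == chrom2
            and abs(pos2 - pos1) <= max_dist
            and is_inward(pos1, strand1, pos2, strand2))

pairs = [
    ("chr1", 1000, "+", "chr1", 1105, "-"),   # inward, 105 bp apart: flagged
    ("chr1", 1000, "-", "chr1", 1105, "+"),   # outward: kept
    ("chr1", 1000, "+", "chr1", 50000, "-"),  # inward but far apart: kept
    ("chr1", 1000, "+", "chr2", 1105, "-"),   # different chromosomes: kept
]

clean = [p for p in pairs if not looks_unligated(p, max_dist=500)]
print(len(clean))
```

Re-plotting the fragment size distribution from the remaining pairs would then show whether the peak at 100-110 bp is explained by this class of pairs.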