Bismark Alignment very slow #670

Open
Nitin123-4 opened this issue May 13, 2024 · 7 comments

Comments

@Nitin123-4

Nitin123-4 commented May 13, 2024

Hi Team,

I am running Bismark with the command below, and it is really slow.

bismark --bowtie2 -N 1 --parallel 4 $RESOURCES2/HG38/ -1 $Read1 -2 $Read2 --output_dir $PWD --temp_dir $PWD/$SAMPLEID"_TEMP" --prefix $SAMPLEID

I have 218,289,382 total reads, i.e. 32.96 Gb of data. The reads were pre-processed with Trimmomatic, and the filtered reads were used for this run.
The alignment took ~22 h to complete. Can you please help with this?

Bismark Version: v0.24.0

@FelixKrueger
Owner

Increasing the number of seed mismatches from the default (0) to -N 1 is probably slowing things down markedly. Lowering this and/or increasing --parallel are your best options.

@Nitin123-4
Author

Nitin123-4 commented May 13, 2024

Thanks for the quick reply.

WARNING: Bismark Parallel (BP?) is resource hungry! Each value of --parallel specified
will effectively lead to a linear increase in compute and memory requirements, so --parallel 4 for
e.g. the GRCm38 mouse genome will probably use ~20 cores and eat ~40GB of RAM, but at the same time
reduce the alignment time to ~25-30%. You have been warned.

With --parallel 4 it is taking ~40GB RAM. If we increase it to --parallel 8 or 10 it will take a lot of memory.

I think this is difficult because it needs more RAM as well.
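
For a rough sense of scale, the linear scaling described in the quoted warning can be extrapolated from the one run reported here (~40GB at --parallel 4). This is only a sketch: `estimate_ram_gb` is a hypothetical helper, not part of Bismark, and real usage may not scale exactly linearly.

```shell
#!/usr/bin/env bash
# Rough linear extrapolation of RAM use from a single observed run.
# Arguments: observed_gb observed_parallel target_parallel
# Assumption: memory grows linearly with --parallel, as the quoted warning suggests.
estimate_ram_gb() {
    local observed_gb=$1 observed_parallel=$2 target_parallel=$3
    # Integer arithmetic: scale the observed total by the ratio of --parallel values.
    echo $(( observed_gb * target_parallel / observed_parallel ))
}

# ~40 GB was observed with --parallel 4 in this thread.
for n in 4 8 10; do
    echo "--parallel $n: ~$(estimate_ram_gb 40 4 "$n") GB"
done
```

Under that assumption, --parallel 8 would need roughly 80GB and --parallel 10 roughly 100GB.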

Also, is it recommended to use 0 mismatches (the default)?

@FelixKrueger
Owner

This is from the Bowtie 2 manual:

-N <int>

Sets the number of mismatches allowed in a seed alignment during multiseed alignment. Can be set to 0 or 1. Setting this higher makes alignment slower (often much slower) but increases sensitivity. Default: 0.

To be honest, I don't think I have ever changed this to 1. It only really requires one multi-seed alignment somewhere in the read as an anchor, which in my experience is pretty much always the case.

@Nitin123-4
Author

Okay. Thanks for your response.

Please advise on this point: with --parallel 4 it is taking ~40GB RAM. If we increase it to --parallel 8 or 10, it will take a lot of memory.

Is there any solution for this?

@FelixKrueger
Owner

I am afraid there isn't really a solution for this: if you ask for more parallel instances, it will also use more memory. You could potentially try giving each Bowtie 2 instance some additional cores (e.g. -p 4), but this will only get you so far, to be honest. It should not use much more memory, though. This could result in the following command line:

bismark -p 4 --parallel 4 $RESOURCES2/HG38/ -1 $Read1 -2 $Read2 --output_dir $PWD --temp_dir $PWD/$SAMPLEID"_TEMP" --prefix $SAMPLEID

@Nitin123-4
Author

Nitin123-4 commented May 13, 2024

Okay, so it means it will use 16 CPUs in total and ~40GB RAM?
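
A back-of-envelope count of alignment threads for the command above might look like the sketch below. It rests on an assumption that is not confirmed in this thread: that each --parallel instance starts two Bowtie 2 processes for a directional library, and that -p applies to each of them. `estimate_bowtie2_threads` is a hypothetical helper, not part of Bismark.

```shell
#!/usr/bin/env bash
# Counts Bowtie 2 alignment threads only; Bismark's own Perl, samtools and
# gzip processes would come on top of this figure.
# Arguments: parallel threads_per_instance [bowtie2_instances_per_job]
estimate_bowtie2_threads() {
    local parallel=$1 threads_per_instance=$2
    # Assumption: 2 Bowtie 2 instances per job (directional library).
    local instances_per_job=${3:-2}
    echo $(( parallel * instances_per_job * threads_per_instance ))
}

# --parallel 4 combined with -p 4
estimate_bowtie2_threads 4 4
```

If that assumption holds, the total would be closer to 32 alignment threads than 16, so it is worth checking against what the machine actually reports.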

@FelixKrueger
Owner

If you open a second terminal and run top, you should be able to monitor CPU and memory usage while the alignment is running.
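
As a non-interactive alternative to watching top, a one-off snapshot can be taken with ps. The sketch below sums the resident memory of processes whose command name matches a pattern; `sum_rss_gib` is a hypothetical helper, and the pattern `bowtie2` assumes the aligner processes carry that string in their name. Summed RSS can overcount memory shared between processes, so treat the result as an upper bound.

```shell
#!/usr/bin/env bash
# Reads "rss command" pairs on stdin (RSS in KiB, as printed by ps) and
# prints the total in GiB for commands matching the given pattern.
sum_rss_gib() {
    awk -v pat="$1" '$2 ~ pat { kib += $1 } END { printf "%.1f\n", kib / 1048576 }'
}

# Snapshot of all current processes, filtered to the Bowtie 2 aligners.
ps -eo rss=,comm= | sum_rss_gib 'bowtie2'
```

Running this a few times during the alignment gives a rough picture of how memory grows with different --parallel and -p settings.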
