threading question #85

Open
rsettlage opened this issue Mar 5, 2021 · 7 comments

@rsettlage

Hi, thanks for this pipeline, I'm loving it. But I am not yet a Snakemake guru.

I have a question regarding optimizing the compute. It seems like I can run the pipeline two ways:

  1. start each sample independently
  2. start the batch

If I do (1), I am starting vpipe with the --cores 128 option (AMD server with 128 physical cores), but it seems to use only 4 threads for those sub-programs that can use them. In the vpipe config files I see the threads option, but that seems to be set to 1. So where did it get the 4, and is there an easy way to change that globally? Something like --threads=128?

If I do (2), is there a way to specify the number of samples that should be processed simultaneously and, similar to the above, the threads to use for each process? Something like: process 8 samples at a time using 16 threads each.

Thanks
Bob

@DrYak DrYak assigned DrYak and kpj Mar 5, 2021
@rsettlage
Author

FYI, I do see the section in the docs that says the default is 4; I am more curious where that is set, since the values I see in the config file suggest 1.

@DrYak
Member

DrYak commented Mar 6, 2021

First, regarding (1) vs (2):

Snakemake doesn't really have an internal notion of samples; it only considers jobs. It builds a DAG of all jobs that need to be run, and then runs each job as soon as all of its dependencies are met (e.g. SNV calling first needs an alignment and won't start before one exists) and as soon as enough resources are free (e.g. enough threads are available).

Currently, it traverses the DAG breadth-first, so it will tend to run most of the samples in parallel (i.e. the alignment jobs will tend to all be scheduled before the SNV-calling jobs).

So if you want each sample to be processed separately, you would need to run a whole snakemake separately for each.
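As a minimal sketch of that per-sample approach (the runs/sample_* directory layout is a hypothetical setup where each directory holds its own config and samples file; the snakemake flags are illustrative, adjust to yours). The commands are echoed rather than executed; drop the echo to actually launch the runs:

```shell
# Demo directories standing in for one V-pipe working directory per sample.
mkdir -p runs/sample_A runs/sample_B
# Launch an independent snakemake in each working directory.
for dir in runs/sample_*/; do
  echo "(cd $dir && snakemake -s vpipe.snake --cores 16)"
done
```

Appending & inside the loop (plus a final wait) would run the per-sample pipelines concurrently instead of one after another.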

@DrYak
Member

DrYak commented Mar 6, 2021

Now for your questions:

regarding threads:

  • each rule takes its number of threads (snakemake's threads: directive) from the threads= parameter in the corresponding section of the config file.
  • the defaults are currently in the file rules/config_default.smk
  • by default (threads=0 for a specific rule) it falls back to the global threads= setting in the [general] section (see here), which is 4 by default.
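Concretely, that fallback could look like this in the config file (the numbers here are purely illustrative, not recommendations):

```ini
[general]
# global default, used whenever a rule's own threads is 0
threads=16

[bwa_align]
# 0 = fall back to the global [general] threads (16 here)
threads=0
```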

@DrYak
Member

DrYak commented Mar 6, 2021

Now regarding fine-tuning your configuration file:

Calling SNVs is done by default using ShoRAH, which works in independent local windows. It is thus an embarrassingly parallel problem and can scale to many threads (currently we run 64 concurrent threads on our Threadrippers), requesting 1 GiB of RAM per thread on average, which works most of the time.

[snv]
consensus=false
time=240
threads=64
mem=1024
localscratch=$TMPDIR

bwa (the default aligner for SARS-CoV-2) works in batches of ~1 million reads.
According to the literature, it is able to scale up to 8-16 threads before contention diminishes any further parallelization. In our case, a very large proportion (3/4) of the samples are processed in 6 batches, so it's not worth requesting more threads:

[bwa_align]
mem=2048
threads=6

@DrYak
Member

DrYak commented Mar 6, 2021

For running specifically 8 samples in parallel and allocating exactly 16 threads to each:

It's not easily done the way V-pipe is currently written.
It might be possible to experiment with snakemake's --batch parameter, but I lack experience with it.

Another approach would be to split your samples file into batches of 8 samples and run them separately. But in that case, you should use the consensus=false option mentioned above, so that SNVs are called against the reference (e.g. for SARS-CoV-2 that would be NC_045512) and not against each batch's consensus (which would make it very difficult to compare results between batches).
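A minimal sketch of such a split using coreutils (the samples.tsv name and its headerless two-column layout are assumptions; the first command just generates demo data):

```shell
# Generate a demo headerless samples TSV with 20 entries (assumption:
# one sample per line, tab-separated sample name and date).
printf 'sample%02d\t20210305\n' $(seq 1 20) > samples.tsv
# Split it into batches of 8 samples each (batch_00, batch_01, batch_02).
split -l 8 -d samples.tsv batch_
wc -l batch_*
```

Each batch_* file would then serve as the samples file for its own V-pipe run (with consensus=false as noted above).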

@DrYak
Member

DrYak commented Mar 6, 2021

Last, a different approach, if you run on an HPC cluster (and not a single 128-core workstation), would be to let snakemake dispatch jobs on the cluster using its --cluster option.
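A sketch of what that could look like with Slurm (the sbatch wrapper string is an illustration; adapt its flags to your site). The command is only stored and echoed here; run it with eval "$cmd":

```shell
# snakemake substitutes each rule's threads: value into {threads}, and
# --jobs caps how many cluster jobs are submitted/active at once.
cmd="snakemake -s vpipe.snake --jobs 100 --cluster 'sbatch --cpus-per-task={threads}'"
echo "$cmd"
```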

@rsettlage
Author

Thanks, awesome information. I am indeed on an HPC system. I think I have settled on running each sample independently, and am hoping the cluster option scales the various steps (jobs) according to the threads used. For samples that seem to be more diverse, the last step of making the JSON file is painfully slow. I noticed that just prior to that there are 10 partitioned VCF files; would it be computationally (time) more efficient to process those individually and then combine the sub-JSONs?
