From 506a104b3684b88477f483fc78d126aa09a3d347 Mon Sep 17 00:00:00 2001
From: John Blischak
Date: Wed, 9 Mar 2022 11:02:04 -0800
Subject: [PATCH] Document support for multi-cluster status scripts.

Close #6

https://github.com/snakemake/snakemake/pull/1459
https://github.com/snakemake/snakemake/pull/977
---
 CHANGELOG.md                              |  3 +++
 README.md                                 | 26 ++++++++++++------------
 examples/README.md                        |  3 +++
 examples/multi-cluster/README.md          | 26 +++++++++++++++++-----
 examples/multi-cluster/Snakefile          |  7 ++++++
 examples/multi-cluster/simple/config.yaml |  4 +++-
 6 files changed, 49 insertions(+), 20 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index d51ee89..2501ff4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,8 @@
 ## dev
 
+Full support for custom status scripts in a multi-cluster setup (requires
+minimum of Snakemake version 7.1.1)
+
 Example with [cluster-cancel][] (requires minimum of Snakemake version 7.0.0)
 
 [cluster-cancel]: https://snakemake.readthedocs.io/en/stable/tutorial/additional_features.html#using-cluster-cancel
diff --git a/README.md b/README.md
index d411dcd..ec1ad4c 100644
--- a/README.md
+++ b/README.md
@@ -46,15 +46,17 @@ post][sichong-post] by Sichong Peng nicely explains this strategy for replacing
   [`extras/`](extras/)) to [`--cluster-status`][cluster-status] to additionally
   handle the job statuses TIMEOUT and CANCELED
 
+* **New** Support for cluster-cancel
+
+* **New** Full support for [multi-cluster setups][multi_cluster] (using a custom
+  status script requires Snakemake 7.1.1+). See the section [Multiple
+  clusters](#multiple-clusters) below
+
 ## Limitations
 
 * Can't use [group jobs][grouping], but they [aren't easy to use in the first
   place][grouping-issue]
 
-* Limited support for [multi-cluster setups][multi_cluster] (please upvote my
-  [PR][pr-multi-cluster] to enable support for using custom scripts with
-  `--cluster-status` in a multi-cluster setup)
-
 * Wildcards can't contain `/` if you want to use them in the name of the Slurm
   log file. This is a Slurm requirement (which makes sense, since it has to
   create a file on the filesystem). You'll either have to change how you manage
@@ -271,18 +273,15 @@ documentation below.
 
    ```python
   # Snakefile
   rule different_cluster:
-      resources:
-          clusters = "c2"
+      resources:
+          clusters="c2"
   ```
 
-1. It's currently not possible to use a custom cluster status script with
-   multi-cluster. After you add the flag `--parsable` to `sbatch`, it will
+1. Using a custom cluster status script in a multi-cluster setup requires
+   Snakemake 7.1.1+. After you add the flag `--parsable` to `sbatch`, it will
    return `jobid;cluster_name`. I adapted `status-sacct.sh` to handle this
-   situation. However, Snakemake doesn't quote the argument, so the semi-colon
-   causes it to try and execute a program that is the name of the cluster.
-   Please see [`examples/multi-cluster/`](examples/multi-cluster) to try out my
-   latest attempt. Also, please upvote my [PR][pr-multi-cluster] to fix this in
-   Snakemake.
+   situation. Please see [`examples/multi-cluster/`](examples/multi-cluster) to
+   try out `status-sacct-multi.sh`.
 
 ## Use speed with caution
@@ -321,6 +320,5 @@ warranties. To make it official, it's released under the [CC0][] license. See
 [min_version]: https://snakemake.readthedocs.io/en/stable/snakefiles/writing_snakefiles.html#depend-on-a-minimum-snakemake-version
 [multi_cluster]: https://slurm.schedmd.com/multi_cluster.html
 [no-cluster-status]: http://bluegenes.github.io/Using-Snakemake_Profiles/
-[pr-multi-cluster]: https://github.com/snakemake/snakemake/pull/977
 [sichong-post]: https://www.sichong.site/2020/02/25/snakemake-and-slurm-how-to-manage-workflow-with-resource-constraint-on-hpc/
 [slurm-official]: https://github.com/Snakemake-Profiles/slurm
diff --git a/examples/README.md b/examples/README.md
index 55b1c72..f7d1611 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -12,6 +12,9 @@ them on your cluster to confirm the expected behavior.
 * `jobs-per-second` - Measures how many jobs Snakemake can submit to Slurm per
   second
 
+* `multi-cluster` - Submit jobs to more than one cluster (requires Snakemake
+  7.1.1+)
+
 * `out-of-memory` - Triggers an out-of-memory error. Snakemake can handle this
   by default
 
diff --git a/examples/multi-cluster/README.md b/examples/multi-cluster/README.md
index ddcfd7e..5814067 100644
--- a/examples/multi-cluster/README.md
+++ b/examples/multi-cluster/README.md
@@ -1,11 +1,27 @@
 # Multiple clusters
 
-**Warning:** Work in progress
+**New feature:** The custom cluster status script for a multi-cluster setup is
+now supported as of Snakemake 7.1.1 (see `simple/status-sacct-multi.sh`).
 
-Submit jobs to a specific cluster. Edit `simple/config.yml` to add the name of
-one or more of your clusters to the argument `--clusters` (separated by commas).
-Run `sacctmgr --parsable show clusters | cut -d'|' -f1` to view the names of the
-available clusters.
+Submit jobs to a specific cluster. Edit the `Snakefile` rule `cluster_name` to
+add the name of one or more of your clusters to the resource `clusters`
+(separated by commas). Run `sacctmgr --parsable show clusters | cut -d'|' -f1`
+to view the names of the available clusters.
+
+```python
+rule cluster_name:
+    output:
+        "output/cluster.txt",
+    resources:
+        clusters="",
+```
+
+You can also change the default cluster in `simple/config.yaml`:
+
+```yaml
+default-resources:
+  - clusters=
+```
 
 The example rule writes the name of the cluster to `output/cluster.txt`.
 
diff --git a/examples/multi-cluster/Snakefile b/examples/multi-cluster/Snakefile
index cdbcd98..21de28d 100644
--- a/examples/multi-cluster/Snakefile
+++ b/examples/multi-cluster/Snakefile
@@ -1,6 +1,13 @@
+import snakemake.utils
+
+snakemake.utils.min_version("7.1.1")
+
+
 rule cluster_name:
     output:
         "output/cluster.txt",
+    resources:
+        clusters="slurm_cluster",
     shell:
         """
         sleep 5s
diff --git a/examples/multi-cluster/simple/config.yaml b/examples/multi-cluster/simple/config.yaml
index e4662c2..68d1291 100644
--- a/examples/multi-cluster/simple/config.yaml
+++ b/examples/multi-cluster/simple/config.yaml
@@ -1,10 +1,12 @@
 cluster:
   mkdir -p logs/{rule} &&
   sbatch
-    --clusters=slurm_cluster
+    --clusters={resources.clusters}
     --job-name=smk-{rule}-{wildcards}
     --output=logs/{rule}/{rule}-{wildcards}-%j.out
     --parsable
+default-resources:
+  - clusters=slurm_cluster
 jobs: 1
 printshellcmds: True
 cluster-status: status-sacct-multi.sh
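---

Note on the mechanics this patch relies on: with `--parsable` and `--clusters`, `sbatch` prints `jobid;cluster_name` rather than a bare job id, so a status script must split on the semicolon before querying `sacct` with `--clusters`. The sketch below illustrates only that parsing step; it is not the shipped `status-sacct-multi.sh` (see `examples/multi-cluster/simple/` for the real script), and the helper name `split_parsable` is hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch of the parsing step a multi-cluster status script needs.
# `sbatch --parsable --clusters=c2 ...` prints e.g. "6375161;c2".

# split_parsable (hypothetical helper): print "jobid cluster" given "jobid;cluster".
split_parsable() {
  raw="$1"
  printf '%s %s\n' "${raw%%;*}" "${raw#*;}"
}

# Only query Slurm when a job id argument was passed and sacct exists, so the
# parsing logic can be exercised on machines without a cluster.
if [ -n "${1:-}" ] && command -v sacct >/dev/null 2>&1; then
  read -r jobid cluster <<<"$(split_parsable "$1")"
  # Ask the cluster the job was actually submitted to, not the local default.
  state=$(sacct --clusters="$cluster" --jobs="$jobid" \
                --format=State --noheader | head -n 1 | awk '{print $1}')
  case "$state" in
    COMPLETED)                            echo success ;;
    RUNNING|PENDING|COMPLETING|SUSPENDED) echo running ;;
    *)                                    echo failed  ;;
  esac
fi
```

The reason this works at all on Snakemake 7.1.1+ is that Snakemake now quotes the job-id argument it passes to `--cluster-status`, so the semicolon reaches the script intact instead of being interpreted by the shell.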