Document support for multi-cluster status scripts. Close #6
jdblischak committed Mar 9, 2022
1 parent 78846d9 commit 506a104
Showing 6 changed files with 49 additions and 20 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,8 @@
## dev

Full support for custom status scripts in a multi-cluster setup (requires
minimum of Snakemake version 7.1.1)

Example with [cluster-cancel][] (requires minimum of Snakemake version 7.0.0)

[cluster-cancel]: https://snakemake.readthedocs.io/en/stable/tutorial/additional_features.html#using-cluster-cancel
26 changes: 12 additions & 14 deletions README.md
@@ -46,15 +46,17 @@ post][sichong-post] by Sichong Peng nicely explains this strategy for replacing
[`extras/`](extras/)) to [`--cluster-status`][cluster-status] to additionally
handle the job statuses TIMEOUT and CANCELED

* **New** Support for cluster-cancel

* **New** Full support for [multi-cluster setups][multi_cluster] (using a custom
status script requires Snakemake 7.1.1+). See the section [Multiple
clusters](#multiple-clusters) below

## Limitations

* Can't use [group jobs][grouping], but they [aren't easy to use in the first
place][grouping-issue]

* Limited support for [multi-cluster setups][multi_cluster] (please upvote my
[PR][pr-multi-cluster] to enable support for using custom scripts with
`--cluster-status` in a multi-cluster setup)

* Wildcards can't contain `/` if you want to use them in the name of the Slurm
log file. This is a Slurm requirement (which makes sense, since it has to
create a file on the filesystem). You'll either have to change how you manage
@@ -271,18 +273,15 @@ documentation below.
```python
# Snakefile
rule different_cluster:
    resources:
        clusters="c2"
```

1. It's currently not possible to use a custom cluster status script with
multi-cluster. After you add the flag `--parsable` to `sbatch`, it will
1. Using a custom cluster status script in a multi-cluster setup requires
Snakemake 7.1.1+. After you add the flag `--parsable` to `sbatch`, it will
return `jobid;cluster_name`. I adapted `status-sacct.sh` to handle this
situation. However, Snakemake doesn't quote the argument, so the semi-colon
causes it to try and execute a program that is the name of the cluster.
Please see [`examples/multi-cluster/`](examples/multi-cluster) to try out my
latest attempt. Also, please upvote my [PR][pr-multi-cluster] to fix this in
Snakemake.
situation. Please see [`examples/multi-cluster/`](examples/multi-cluster) to
try out `status-sacct-multi.sh`
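
To make the `jobid;cluster_name` handling concrete, here is a minimal Python sketch of how a status script might split that argument and translate a Slurm job state into one of the three words Snakemake's `--cluster-status` expects (`success`, `failed`, `running`). This is not the contents of `status-sacct-multi.sh`. The function names `split_job_arg`, `to_snakemake_status`, and `query_state` are hypothetical, and the grouping of Slurm states is an assumption you may need to adjust for your site.

```python
# Hypothetical sketch, not the actual status-sacct-multi.sh.
import subprocess
import sys

# Assumed set of Slurm states that mean "still in progress".
RUNNING_STATES = {"PENDING", "CONFIGURING", "RUNNING", "COMPLETING", "SUSPENDED"}


def split_job_arg(arg):
    """Split 'jobid;cluster_name' (as printed by `sbatch --parsable`).

    Returns (jobid, None) when no cluster name is attached.
    """
    jobid, _, cluster = arg.strip().partition(";")
    return jobid, (cluster or None)


def to_snakemake_status(slurm_state):
    """Map a Slurm state string to success / running / failed."""
    words = slurm_state.split()
    # Keep only the first word, e.g. "CANCELLED by 1000" -> "CANCELLED".
    state = words[0].rstrip("+") if words else ""
    if state == "COMPLETED":
        return "success"
    # Assumption: an empty state means sacct has not recorded the job yet.
    if state in RUNNING_STATES or state == "":
        return "running"
    return "failed"  # e.g. FAILED, TIMEOUT, CANCELLED, OUT_OF_MEMORY


def query_state(jobid, cluster):
    """Ask sacct for the state of one job, optionally on a named cluster."""
    cmd = ["sacct", "-j", jobid, "--format=State", "--noheader", "--parsable2"]
    if cluster:
        cmd.append(f"--clusters={cluster}")
    out = subprocess.run(cmd, capture_output=True, text=True, check=False).stdout
    lines = out.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__" and len(sys.argv) == 2:
    jobid, cluster = split_job_arg(sys.argv[1])
    print(to_snakemake_status(query_state(jobid, cluster)))
```

Keeping the parsing and the state mapping in separate functions means both can be checked without access to a Slurm cluster; only `query_state` needs `sacct` on the `PATH`.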

## Use speed with caution

@@ -321,6 +320,5 @@ warranties. To make it official, it's released under the [CC0][] license. See
[min_version]: https://snakemake.readthedocs.io/en/stable/snakefiles/writing_snakefiles.html#depend-on-a-minimum-snakemake-version
[multi_cluster]: https://slurm.schedmd.com/multi_cluster.html
[no-cluster-status]: http://bluegenes.github.io/Using-Snakemake_Profiles/
[pr-multi-cluster]: https://github.com/snakemake/snakemake/pull/977
[sichong-post]: https://www.sichong.site/2020/02/25/snakemake-and-slurm-how-to-manage-workflow-with-resource-constraint-on-hpc/
[slurm-official]: https://github.com/Snakemake-Profiles/slurm
3 changes: 3 additions & 0 deletions examples/README.md
@@ -12,6 +12,9 @@ them on your cluster to confirm the expected behavior.
* `jobs-per-second` - Measures how many jobs Snakemake can submit to Slurm per
second

* `multi-cluster` - Submit jobs to more than one cluster (requires Snakemake
7.1.1+)

* `out-of-memory` - Triggers an out-of-memory error. Snakemake can handle this
by default

26 changes: 21 additions & 5 deletions examples/multi-cluster/README.md
@@ -1,11 +1,27 @@
# Multiple clusters

**Warning:** Work in progress
**New feature:** The custom cluster status script for a multi-cluster setup is
now supported as of Snakemake 7.1.1 (see `simple/status-sacct-multi.sh`)

Submit jobs to a specific cluster. Edit `simple/config.yml` to add the name of
one or more of your clusters to the argument `--clusters` (separated by commas).
Run `sacctmgr --parsable show clusters | cut -d'|' -f1` to view the names of the
available clusters.
Submit jobs to a specific cluster. Edit the `Snakefile` rule `cluster_name` to
add the name of one or more of your clusters to the resource `clusters`
(separated by commas). Run `sacctmgr --parsable show clusters | cut -d'|' -f1`
to view the names of the available clusters.

```python
rule cluster_name:
    output:
        "output/cluster.txt",
    resources:
        clusters="<cluster-name>",
```

You can also change the default cluster in `simple/config.yaml`:

```yaml
default-resources:
- clusters=<default-cluster>
```

The example rule writes the name of the cluster to `output/cluster.txt`.
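
If you would rather collect the cluster names from a script, the `sacctmgr` pipeline mentioned earlier can be sketched in Python. This is a minimal sketch: the function names `parse_cluster_names` and `list_clusters` are hypothetical, and it assumes `sacctmgr --parsable show clusters` prints a header row followed by one `|`-separated line per cluster. Unlike the `cut` command, it drops the header row.

```python
# Hypothetical helper, assuming sacctmgr prints a header row first.
import subprocess


def parse_cluster_names(sacctmgr_output):
    """Return the first '|'-separated field of each data line."""
    lines = [ln for ln in sacctmgr_output.splitlines() if ln.strip()]
    # Skip the header row (assumed to be the first non-empty line).
    return [ln.split("|")[0] for ln in lines[1:]]


def list_clusters():
    """Run sacctmgr and return the cluster names it reports."""
    result = subprocess.run(
        ["sacctmgr", "--parsable", "show", "clusters"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_cluster_names(result.stdout)
```

`list_clusters()` only works on a machine where `sacctmgr` is installed; `parse_cluster_names` can be tried on any captured output.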

7 changes: 7 additions & 0 deletions examples/multi-cluster/Snakefile
@@ -1,6 +1,13 @@
import snakemake.utils

snakemake.utils.min_version("7.1.1")


rule cluster_name:
    output:
        "output/cluster.txt",
    resources:
        clusters="slurm_cluster",
    shell:
        """
        sleep 5s
4 changes: 3 additions & 1 deletion examples/multi-cluster/simple/config.yaml
@@ -1,10 +1,12 @@
cluster:
  mkdir -p logs/{rule} &&
  sbatch
    --clusters=slurm_cluster
    --clusters={resources.clusters}
    --job-name=smk-{rule}-{wildcards}
    --output=logs/{rule}/{rule}-{wildcards}-%j.out
    --parsable
default-resources:
- clusters=slurm_cluster
jobs: 1
printshellcmds: True
cluster-status: status-sacct-multi.sh
