From 506a104b3684b88477f483fc78d126aa09a3d347 Mon Sep 17 00:00:00 2001
From: John Blischak
Date: Wed, 9 Mar 2022 11:02:04 -0800
Subject: [PATCH] Document support for multi-cluster status scripts.

Close #6

https://github.com/snakemake/snakemake/pull/1459
https://github.com/snakemake/snakemake/pull/977
---
 CHANGELOG.md                              |  3 +++
 README.md                                 | 26 ++++++++++++------------
 examples/README.md                        |  3 +++
 examples/multi-cluster/README.md          | 26 +++++++++++++++++-----
 examples/multi-cluster/Snakefile          |  7 ++++++
 examples/multi-cluster/simple/config.yaml |  4 +++-
 6 files changed, 49 insertions(+), 20 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index d51ee89..2501ff4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,8 @@
 ## dev
 
+Full support for custom status scripts in a multi-cluster setup (requires
+minimum of Snakemake version 7.1.1)
+
 Example with [cluster-cancel][] (requires minimum of Snakemake version 7.0.0)
 
 [cluster-cancel]: https://snakemake.readthedocs.io/en/stable/tutorial/additional_features.html#using-cluster-cancel
diff --git a/README.md b/README.md
index d411dcd..ec1ad4c 100644
--- a/README.md
+++ b/README.md
@@ -46,15 +46,17 @@ post][sichong-post] by Sichong Peng nicely explains this strategy for replacing
   [`extras/`](extras/)) to [`--cluster-status`][cluster-status] to additionally
   handle the job statuses TIMEOUT and CANCELED
 
+* **New** Support for cluster-cancel
+
+* **New** Full support for [multi-cluster setups][multi_cluster] (using a custom
+  status script requires Snakemake 7.1.1+). See the section [Multiple
+  clusters](#multiple-clusters) below
+
 ## Limitations
 
 * Can't use [group jobs][grouping], but they [aren't easy to use in the first
   place][grouping-issue]
 
-* Limited support for [multi-cluster setups][multi_cluster] (please upvote my
-  [PR][pr-multi-cluster] to enable support for using custom scripts with
-  `--cluster-status` in a multi-cluster setup)
-
 * Wildcards can't contain `/` if you want to use them in the name of the Slurm
   log file. This is a Slurm requirement (which makes sense, since it has to
   create a file on the filesystem). You'll either have to change how you manage
@@ -271,18 +273,15 @@ documentation below.
 
    ```python
   # Snakefile
   rule different_cluster:
-      resources:
-          clusters = "c2"
+      resources:
+          clusters="c2"
   ```
 
-1. It's currently not possible to use a custom cluster status script with
-   multi-cluster. After you add the flag `--parsable` to `sbatch`, it will
+1. Using a custom cluster status script in a multi-cluster setup requires
+   Snakemake 7.1.1+. After you add the flag `--parsable` to `sbatch`, it will
    return `jobid;cluster_name`. I adapted `status-sacct.sh` to handle this
-   situation. However, Snakemake doesn't quote the argument, so the semi-colon
-   causes it to try and execute a program that is the name of the cluster.
-   Please see [`examples/multi-cluster/`](examples/multi-cluster) to try out my
-   latest attempt. Also, please upvote my [PR][pr-multi-cluster] to fix this in
-   Snakemake.
+   situation. Please see [`examples/multi-cluster/`](examples/multi-cluster) to
+   try out `status-sacct-multi.sh`.
 
 ## Use speed with caution
@@ -321,6 +320,5 @@ warranties. To make it official, it's released under the [CC0][] license. See
 [min_version]: https://snakemake.readthedocs.io/en/stable/snakefiles/writing_snakefiles.html#depend-on-a-minimum-snakemake-version
 [multi_cluster]: https://slurm.schedmd.com/multi_cluster.html
 [no-cluster-status]: http://bluegenes.github.io/Using-Snakemake_Profiles/
-[pr-multi-cluster]: https://github.com/snakemake/snakemake/pull/977
 [sichong-post]: https://www.sichong.site/2020/02/25/snakemake-and-slurm-how-to-manage-workflow-with-resource-constraint-on-hpc/
 [slurm-official]: https://github.com/Snakemake-Profiles/slurm
diff --git a/examples/README.md b/examples/README.md
index 55b1c72..f7d1611 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -12,6 +12,9 @@ them on your cluster to confirm the expected behavior.
 * `jobs-per-second` - Measures how many jobs Snakemake can submit to Slurm per
   second
 
+* `multi-cluster` - Submit jobs to more than one cluster (requires Snakemake
+  7.1.1+)
+
 * `out-of-memory` - Triggers an out-of-memory error. Snakemake can handle this
   by default
 
diff --git a/examples/multi-cluster/README.md b/examples/multi-cluster/README.md
index ddcfd7e..5814067 100644
--- a/examples/multi-cluster/README.md
+++ b/examples/multi-cluster/README.md
@@ -1,11 +1,27 @@
 # Multiple clusters
 
-**Warning:** Work in progress
+**New feature:** The custom cluster status script for a multi-cluster setup is
+now supported as of Snakemake 7.1.1 (see `simple/status-sacct-multi.sh`).
 
-Submit jobs to a specific cluster. Edit `simple/config.yml` to add the name of
-one or more of your clusters to the argument `--clusters` (separated by commas).
-Run `sacctmgr --parsable show clusters | cut -d'|' -f1` to view the names of the
-available clusters.
+Submit jobs to a specific cluster. Edit the `Snakefile` rule `cluster_name` to
+add the name of one or more of your clusters to the resource `clusters`
+(separated by commas). Run `sacctmgr --parsable show clusters | cut -d'|' -f1`
+to view the names of the available clusters.
+
+```python
+rule cluster_name:
+    output:
+        "output/cluster.txt",
+    resources:
+        clusters="",
+```
+
+You can also change the default cluster in `simple/config.yaml`:
+
+```yaml
+default-resources:
+  - clusters=
+```
 
 The example rule writes the name of the cluster to `output/cluster.txt`.
 
diff --git a/examples/multi-cluster/Snakefile b/examples/multi-cluster/Snakefile
index cdbcd98..21de28d 100644
--- a/examples/multi-cluster/Snakefile
+++ b/examples/multi-cluster/Snakefile
@@ -1,6 +1,13 @@
+import snakemake.utils
+
+snakemake.utils.min_version("7.1.1")
+
+
 rule cluster_name:
     output:
         "output/cluster.txt",
+    resources:
+        clusters="slurm_cluster",
     shell:
         """
         sleep 5s
diff --git a/examples/multi-cluster/simple/config.yaml b/examples/multi-cluster/simple/config.yaml
index e4662c2..68d1291 100644
--- a/examples/multi-cluster/simple/config.yaml
+++ b/examples/multi-cluster/simple/config.yaml
@@ -1,10 +1,12 @@
 cluster:
   mkdir -p logs/{rule} &&
   sbatch
-    --clusters=slurm_cluster
+    --clusters={resources.clusters}
     --job-name=smk-{rule}-{wildcards}
     --output=logs/{rule}/{rule}-{wildcards}-%j.out
     --parsable
+default-resources:
+  - clusters=slurm_cluster
 jobs: 1
 printshellcmds: True
 cluster-status: status-sacct-multi.sh
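---

Note on the mechanics this patch relies on: with `--parsable` and `--clusters`, `sbatch` prints `jobid;cluster_name` rather than a bare job id, so a status script must split on the semicolon before querying `sacct` with `--clusters`. The sketch below illustrates only that parsing step; it is not the shipped `status-sacct-multi.sh` (see `examples/multi-cluster/simple/` for the real script), and the helper name `split_parsable` is hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch of the parsing step a multi-cluster status script needs.
# `sbatch --parsable --clusters=c2 ...` prints e.g. "6375161;c2".

# split_parsable (hypothetical helper): print "jobid cluster" given "jobid;cluster".
split_parsable() {
  raw="$1"
  printf '%s %s\n' "${raw%%;*}" "${raw#*;}"
}

# Only query Slurm when a job id argument was passed and sacct exists, so the
# parsing logic can be exercised on machines without a cluster.
if [ -n "${1:-}" ] && command -v sacct >/dev/null 2>&1; then
  read -r jobid cluster <<<"$(split_parsable "$1")"
  # Ask the cluster the job was actually submitted to, not the local default.
  state=$(sacct --clusters="$cluster" --jobs="$jobid" \
                --format=State --noheader | head -n 1 | awk '{print $1}')
  case "$state" in
    COMPLETED)                            echo success ;;
    RUNNING|PENDING|COMPLETING|SUSPENDED) echo running ;;
    *)                                    echo failed  ;;
  esac
fi
```

The reason this works at all on Snakemake 7.1.1+ is that Snakemake now quotes the job-id argument it passes to `--cluster-status`, so the semicolon reaches the script intact instead of being interpreted by the shell.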