SLURM: specifying extra arguments for GPU binding #436

Open
BenWibking opened this issue Jan 29, 2024 · 4 comments

@BenWibking

Is there a recommended way to specify extra SLURM options for GPU bindings?

I tried using the args: key in the batch block (https://maestrowf.readthedocs.io/en/latest/Maestro/scheduling.html), but the options did not get propagated to the generated *.sh job script.
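
Roughly, this is the sort of thing I was attempting (going off the linked docs; the exact spelling and format of the args entries here are illustrative, not something I've verified against the schema):

    batch:
        type: slurm
        host: mymachine        # placeholder host name
        bank: mybank           # placeholder bank/account
        queue: pbatch          # placeholder queue/partition
        args:
            gpu-bind: "none"
            ntasks-per-node: 4
            gpus-per-task: 1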

Following #340, the workaround I've used so far is to specify these options as part of the run command so that they get copied into the job script:

    - name: run-sim
      description: Run the simulation.
      run:
          cmd: |
              #SBATCH --mem=0
              #SBATCH --constraint="scratch"
              #SBATCH --ntasks-per-node=4
              #SBATCH --cpus-per-task=16
              #SBATCH --gpus-per-task=1
              #SBATCH --gpu-bind=none

              srun bash -c "
                  export CUDA_VISIBLE_DEVICES=\$((3-SLURM_LOCALID));
                  $(BINARY_PATH) -i $(generate-infile.workspace)/params.in" > logfile.txt
          depends: [generate-infile]
          nodes: 1
          exclusive: True
          walltime: "00:10:00"
@jwhite242
Collaborator

So it doesn't look like there's great handling of GPUs in the Slurm adapter at the moment, despite there being a hook for adding the gpus=.. bit to the header, which I think passes through on the step's 'gpus:' key alongside nodes/procs/etc. The only extra one explicitly supported looks to be 'cores per task'. Also note these are decoupled a bit in the script adapters: the header applies to the entire batch job (along with the batch block keys), while many of the keys attached to the step get applied independently to each srun when using the $(LAUNCHER) syntax, which has some limited support for specifying procs/nodes per launcher invocation.

And just to better understand the final use case: are you also looking to have, say, 4 different tasks (or sruns) inside this step, one per GPU, or would you prefer to keep each one separate and pack the allocation with many jobs using an embedded Flux instance? Either way, it looks like we'll need to wire up some extra hooks/pass-through for these GPU-related args in the Slurm adapter. I think we could also add 'c' and 'g' flags to the new launcher syntax if you want more independent control of multiple $(LAUNCHER) tokens in a step (see the new-style launcher docs).
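
For reference, the step-level resource keys that do get picked up today look roughly like this (just a sketch; the numbers are placeholders, and as noted the gpus key only feeds the #SBATCH header rather than the individual sruns):

    - name: run-sim
      description: Run the simulation.
      run:
          cmd: |
              $(LAUNCHER) $(BINARY_PATH) -i params.in > logfile.txt
          nodes: 1
          procs: 4
          cores per task: 16
          gpus: 4
          walltime: "00:10:00"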

@BenWibking
Author

BenWibking commented Jan 30, 2024

The use case for this job step is just a single MPI job across 1+ nodes. (Other workflow steps are CPU-only, so they need a different binding/ntasks-per-node, but for now that's a separate issue.)

The somewhat nonstandard options are just to get the right mapping of NUMA domains to GPUs due to the weird topology on this system, plus a workaround to avoid cgroup resource isolation being applied to the GPUs (since that prevents CUDA IPC from working between MPI ranks).

Using

#SBATCH --gpu-bind=none
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-node=4

might accomplish the same binding, but I haven't tested that yet. Is there a built-in way to specify this alternative set of SLURM options?

@jwhite242
Collaborator

No, it doesn't look like there's a better built-in way to set extra/unknown sbatch options than what you're currently doing by putting them at the top of your step cmd.

Will have to look into exposing more of these options/bindings across the script adapters. Other than the options in your initial snippet, are there any others among the numerous sbatch/srun options you'd be interested in?

@BenWibking
Author

Not that I can think of. The above examples should cover it.
