title | date | weight | description |
---|---|---|---|
Run a MPIJob | 2023-05-16 | 6 | Run a Kueue scheduled MPIJob |
This page shows how to leverage Kueue's scheduling and resource management capabilities when running MPI Operator MPIJobs.
This guide is for batch users who have a basic understanding of Kueue. For more information, see Kueue's overview.
See Administer cluster quotas for details on the initial cluster setup.
To install the MPI Operator, see the MPI Operator installation guide.
You can modify the Kueue configuration of an installed release to include MPIJobs as an allowed workload.
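As a rough sketch of what that change can look like, the fragment below lists MPIJob among the enabled integrations in Kueue's Configuration object. It assumes the `config.kueue.x-k8s.io/v1beta1` configuration API and the framework name `kubeflow.org/mpijob`; check both against the Kueue version you have installed, since the API version and the default set of frameworks can differ between releases.

```yaml
# Sketch of a Kueue controller configuration fragment (assumed API version).
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
integrations:
  frameworks:
  - "batch/job"            # plain Kubernetes Jobs
  - "kubeflow.org/mpijob"  # MPI Operator MPIJobs
```

In a typical installation this configuration is stored in a ConfigMap used by the Kueue controller manager, and the controller usually needs a restart to pick up the change. Treat the exact location as an assumption to confirm for your deployment.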
The target local queue should be specified in the metadata.labels section of the MPIJob configuration.

    metadata:
      labels:
        kueue.x-k8s.io/queue-name: user-queue
    spec:
      runPolicy:
        suspend: true

Setting suspend yourself is optional: by default, Kueue sets suspend to true via a webhook and unsuspends the MPIJob when it is admitted.
This example is based on https://github.com/kubeflow/mpi-operator/blob/ccf2756f749336d652fa6b10a732e241a40c7aa6/examples/v2beta1/pi/pi.yaml.
{{< include "examples/jobs/sample-mpijob.yaml" "yaml" >}}
For equivalent instructions in Python, see Run Python Jobs.