title | linkTitle | date | weight | description |
---|---|---|---|---|
Run A RayJob | RayJobs | 2023-05-18 | 6 | Run a Kueue scheduled RayJob. |
This page shows how to leverage Kueue's scheduling and resource management capabilities when running KubeRay's RayJob.
This guide is for batch users who have a basic understanding of Kueue. For more information, see Kueue's overview.
- Check Administer cluster quotas for details on the initial Kueue setup.
- See KubeRay Installation for installation and configuration details of KubeRay.
When running RayJobs on Kueue, take into consideration the following aspects:

The target local queue should be specified in the `metadata.labels` section of the RayJob configuration.

    metadata:
      labels:
        kueue.x-k8s.io/queue-name: user-queue
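The `user-queue` label value above refers to a LocalQueue that must already exist in the RayJob's namespace. As a minimal sketch of what that object might look like, assuming the `kueue.x-k8s.io/v1beta1` API version, the `default` namespace, and a ClusterQueue named `cluster-queue` (the namespace and ClusterQueue name are illustrative and do not come from this page; follow Administer cluster quotas for the actual setup):

```yaml
# Sketch only: namespace and ClusterQueue name are assumptions for illustration.
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: default          # assumed: must match the namespace of the RayJob
  name: user-queue            # matches the kueue.x-k8s.io/queue-name label value
spec:
  clusterQueue: cluster-queue # assumed name of an existing ClusterQueue
```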
The resource needs of the workload can be configured in the `spec.rayClusterSpec`.

    headGroupSpec:
      template:
        spec:
          containers:
            - resources:
                requests:
                  cpu: "1"
    workerGroupSpecs:
      - template:
          spec:
            containers:
              - resources:
                  requests:
                    cpu: "1"
- A Kueue managed RayJob cannot use an existing RayCluster.
- The RayCluster should be deleted at the end of the job execution, so `spec.shutdownAfterJobFinishes` should be `true`.
- Because Kueue will reserve resources for the RayCluster, `spec.rayClusterSpec.enableInTreeAutoscaling` should be `false`.
- Because a Kueue workload can have a maximum of 8 PodSets and the head group takes one of them, the maximum number of `spec.rayClusterSpec.workerGroupSpecs` is 7.
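The constraints above can be put together in a single manifest. The following is a rough sketch rather than the sample shipped with Kueue: the `ray.io/v1alpha1` API version, the `generateName` prefix, the entrypoint path, the container names, the group name, and the image tag are all assumptions chosen for illustration, and they may need adjusting to the installed KubeRay release.

```yaml
# Sketch only: values marked "assumed" are illustrative, not taken from this page.
apiVersion: ray.io/v1alpha1            # assumed: depends on the installed KubeRay version
kind: RayJob
metadata:
  generateName: rayjob-sketch-         # assumed prefix; allows repeated `kubectl create`
  labels:
    kueue.x-k8s.io/queue-name: user-queue   # target local queue
spec:
  entrypoint: python /home/ray/samples/sample_code.py   # assumed script path
  shutdownAfterJobFinishes: true       # delete the RayCluster when the job ends
  rayClusterSpec:                      # inline cluster; an existing RayCluster cannot be used
    enableInTreeAutoscaling: false     # Kueue reserves resources up front
    headGroupSpec:                     # counts as one PodSet
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-head                 # assumed name
              image: rayproject/ray:2.4.0    # assumed image tag
              resources:
                requests:
                  cpu: "1"
    workerGroupSpecs:                  # at most 7 entries, one PodSet each
      - groupName: small-group         # assumed name
        replicas: 1
        minReplicas: 1
        maxReplicas: 1
        rayStartParams: {}
        template:
          spec:
            containers:
              - name: ray-worker             # assumed name
                image: rayproject/ray:2.4.0  # assumed image tag
                resources:
                  requests:
                    cpu: "1"
```

Keeping `replicas`, `minReplicas`, and `maxReplicas` equal in this sketch reflects that autoscaling is disabled, so the worker count Kueue admits is the count that runs.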
In this example, the code is provided to the Ray framework via a ConfigMap.
{{< include "examples/jobs/ray-job-code-sample.yaml" "yaml" >}}
The RayJob looks like the following:
{{< include "examples/jobs/ray-job-sample.yaml" "yaml" >}}
You can run this RayJob with the following commands:

    # Create the code ConfigMap (once)
    kubectl apply -f ray-job-code-sample.yaml
    # Create a RayJob. You can run this command multiple times
    # to observe the queueing and admission of the jobs.
    kubectl create -f ray-job-sample.yaml