title: Run A RayJob
linkTitle: RayJobs
date: 2023-05-18
weight: 6
description: Run a Kueue scheduled RayJob.

This page shows how to leverage Kueue's scheduling and resource management capabilities when running KubeRay's RayJob.

This guide is for batch users who have a basic understanding of Kueue. For more information, see Kueue's overview.

Before you begin

  1. Check Administer cluster quotas for details on the initial Kueue setup.

  2. See KubeRay Installation for installation and configuration details of KubeRay.

RayJob definition

When running RayJobs on Kueue, take into consideration the following aspects:

a. Queue selection

The target local queue should be specified in the metadata.labels section of the RayJob configuration.

metadata:
  labels:
    kueue.x-k8s.io/queue-name: user-queue
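
The label value must name a LocalQueue in the same namespace as the RayJob. As a minimal sketch, assuming the single ClusterQueue setup described in Administer cluster quotas, a matching LocalQueue could look like the following. The `default` namespace, the `cluster-queue` name, and the `v1beta1` API version are assumptions and may differ in your cluster.

```yaml
# Hypothetical LocalQueue that the queue-name label above would point to.
apiVersion: kueue.x-k8s.io/v1beta1   # assumption: depends on the installed Kueue version
kind: LocalQueue
metadata:
  namespace: default                 # assumption: must be the RayJob's namespace
  name: user-queue                   # matches the kueue.x-k8s.io/queue-name label
spec:
  clusterQueue: cluster-queue        # assumption: name of an existing ClusterQueue
```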

b. Configure the resource needs

Configure the resource needs of the workload in spec.rayClusterSpec by setting resource requests on the head group and worker group containers.

    headGroupSpec:
      template:
        spec:
          containers:
            - resources:
                requests:
                  cpu: "1"
    workerGroupSpecs:
      - template:
          spec:
            containers:
              - resources:
                  requests:
                    cpu: "1"
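
To illustrate how these requests add up, the fragment below extends the example with a worker replica count and memory requests. It is a sketch that assumes Kueue counts one pod for the head group and `replicas` pods for each worker group. The container names, group name, replica count, and memory values are hypothetical.

```yaml
# Fragment of spec.rayClusterSpec (illustrative values, not taken from the sample files)
headGroupSpec:
  template:
    spec:
      containers:
        - name: ray-head            # hypothetical container name
          resources:
            requests:
              cpu: "1"
              memory: 2Gi
workerGroupSpecs:
  - groupName: small-group          # hypothetical group name
    replicas: 3
    template:
      spec:
        containers:
          - name: ray-worker        # hypothetical container name
            resources:
              requests:
                cpu: "1"
                memory: 1Gi
# Quota needed for admission under the assumption above:
#   cpu:    1 + 3 * 1     = 4
#   memory: 2Gi + 3 * 1Gi = 5Gi
```

If the target queue has less free quota than this total, the RayJob would be expected to stay queued until enough quota is released.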

c. Limitations

  • A Kueue managed RayJob cannot use an existing RayCluster.
  • The RayCluster should be deleted at the end of the job execution, so spec.shutdownAfterJobFinishes should be true.
  • Because Kueue will reserve resources for the RayCluster, spec.rayClusterSpec.enableInTreeAutoscaling should be false.
  • Because a Kueue workload can have a maximum of 8 PodSets and the head group takes one of them, the maximum number of spec.rayClusterSpec.workerGroupSpecs is 7.
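
The limitations above can be sketched as a RayJob skeleton. This is an illustrative fragment rather than the sample shipped with Kueue: the job name is hypothetical, the `ray.io/v1` API version is an assumption (older KubeRay releases serve `ray.io/v1alpha1`), and the head and worker group definitions are left out. Field names are written in the lower camelCase form used in manifests.

```yaml
apiVersion: ray.io/v1                  # assumption: depends on the installed KubeRay version
kind: RayJob
metadata:
  name: ray-job-sketch                 # hypothetical name
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  # No spec.clusterSelector is set: the job creates its own RayCluster
  # instead of reusing an existing one.
  shutdownAfterJobFinishes: true       # delete the RayCluster when the job ends
  rayClusterSpec:
    enableInTreeAutoscaling: false     # Kueue reserves the resources up front
    # headGroupSpec and workerGroupSpecs (at most 7 entries) go here.
```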

Example RayJob

In this example, the code is provided to the Ray framework via a ConfigMap.

{{< include "examples/jobs/ray-job-code-sample.yaml" "yaml" >}}

The RayJob looks like the following:

{{< include "examples/jobs/ray-job-sample.yaml" "yaml" >}}

You can run this RayJob with the following commands:

# Create the code ConfigMap (once)
kubectl apply -f ray-job-code-sample.yaml
# Create a RayJob. You can run this command multiple times
# to observe the queueing and admission of the jobs.
kubectl create -f ray-job-sample.yaml