In the latest version, the job cannot run because a GPU quota is set in the queue #3426

Closed
ffz12 opened this issue Apr 19, 2024 · 18 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@ffz12

ffz12 commented Apr 19, 2024

What happened:

In the latest version, a job cannot run when a GPU quota is set on the queue.

What you expected to happen:

GPU quotas can be set on queues and jobs in those queues can still be scheduled.

How to reproduce it (as minimally and precisely as possible):

Node resources are sufficient.
1. The queue configuration file is as follows:
a800.yaml

apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: a800
spec:
    reclaimable: true
    weight: 1
    capability:
      nvidia.com/gpu: "4"
      cpu: "5"

2. The job configuration file is as follows:
job1.yaml

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: job-1
spec:
  minAvailable: 1
  schedulerName: volcano
  queue: a800
  policies:
    - event: PodEvicted
      action: RestartJob
  tasks:
    - replicas: 1
      name: nginx
      policies:
      - event: TaskCompleted
        action: CompleteJob
      template:
        spec:
          containers:
            - command:
              - sleep
              - 10m
              image: harbor.unijn.cn/zhaofengfeng/dev:v1
              name: nginx
              resources:
                requests:
                  cpu: 4
                  nvidia.com/gpu: "3"
                limits:
                  cpu: 4
                  nvidia.com/gpu: "3"
          restartPolicy: Never

3. The job is Pending after applying it:

 kubectl  apply -f job1.yaml

 kubectl get vcjob
NAME    STATUS    MINAVAILABLE   RUNNINGS   AGE
job-1   Pending   1                         22s

kubectl describe vcjob job-1
...
 Warning  PodGroupPending  63s   vc-controller-manager  PodGroup default:job-1 unschedule,reason: 1/0 tasks in gang unschedulable: pod group is not ready, 1 minAvailable

4. In the earlier version (v1.7.2), the GPU quota function works normally:
cat queue_a6k_ada.yaml

apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: a6kada
spec:
    reclaimable: true
    weight: 1
    capability:
      nvidia.com/gpu: "1"
    affinity:            # added field
      nodeGroupAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - A6k_ada

cat job2.yaml

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: job-3
spec:
  minAvailable: 1
  schedulerName: volcano
  queue: a6kada
  policies:
    - event: PodEvicted
      action: RestartJob
  tasks:
    - replicas: 1
      name: nginx
      policies:
      - event: TaskCompleted
        action: CompleteJob
      template:
        spec:
          containers:
            - command:
              - sleep
              - 10m
              image: nginx:latest
              name: nginx
              resources:
                requests:
                  cpu: 1
                  nvidia.com/gpu: "1"
                limits:
                  cpu: 1
                  nvidia.com/gpu: "1"
          restartPolicy: Never
 kubectl get po
NAME            READY   STATUS    RESTARTS   AGE
job-3-nginx-0   1/1     Running   0          46s

Anything else we need to know?:

Environment:

  • Volcano Version:
  • Kubernetes version (use kubectl version): 1.23.10
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release): CentOS Linux release 7.9.2009 (Core)
  • Kernel (e.g. uname -a): 3.10.0-1160.el7.x86_64
  • Install tools:
  • Others:
@ffz12 added the kind/bug label Apr 19, 2024
@lowang-bh
Member

Please use YAML code blocks in Markdown to format the YAML. Thanks.

@Monokaix
Member

Please also paste the Volcano scheduler logs :)

@ffz12
Author

ffz12 commented Apr 19, 2024

1.log

@lowang-bh
Member

1.log

The log shows the job is skipped because the PodGroup is in Pending status, not Inqueue.

I0419 06:46:59.356535       1 enqueue.go:79] Try to enqueue PodGroup to 1 Queues
I0419 06:46:59.356568       1 overcommit.go:123] Sufficient resources, permit job <default/job-1-ee4a4c4d-932b-4a74-872d-bb31c0565b47> to be inqueue
I0419 06:46:59.356670       1 allocate.go:74] Job <default/job-1-ee4a4c4d-932b-4a74-872d-bb31c0565b47> Queue <a800> skip allocate, reason: job status is pending.
I0419 06:46:59.356689       1 allocate.go:64] Try to allocate resource to 0 Queues

@ffz12
Author

ffz12 commented Apr 22, 2024

Similarly, running on the latest version shows the PodGroup as pending with no apparent cause, even though resources are sufficient. I saw this PodGroup error at the time, but I don't know how to fix it.

@ffz12
Author

ffz12 commented Apr 23, 2024

So how do we solve this problem? In the newer version, the queue's GPU capability does not work, while it does in the older version. For example:

apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: a800
spec:
    reclaimable: true
    weight: 1
    capability:
      nvidia.com/gpu: "4"

@lowang-bh
Member

You should check the scheduler config, and raise the scheduler log level to see which plugin rejects the PodGroup from entering the Inqueue status.
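For reference, a minimal sketch of raising the verbosity, assuming a typical install where the scheduler runs as a Deployment named volcano-scheduler in the volcano-system namespace and accepts klog's -v flag in its args (keep any existing args and only change the -v value):

# Sketch only: deployment name, namespace, and default args are assumptions
# based on a standard Volcano install.
spec:
  template:
    spec:
      containers:
      - name: volcano-scheduler
        args:
        - --logtostderr
        - -v=5   # raise klog verbosity so per-plugin enqueue decisions are logged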

@Monokaix
Member

Please paste the scheduler ConfigMap, and try restarting the Volcano scheduler.

@ffz12
Author

ffz12 commented Apr 25, 2024

1. The volcano-scheduler log level has been set appropriately.
2. The volcano-scheduler-configmap is as follows:

apiVersion: v1
data:
  volcano-scheduler.conf: |
    actions: "enqueue, allocate, backfill, reclaim, preempt"
    tiers:
    - plugins:
      - name: priority
      - name: gang
        enablePreemptable: false
      - name: conformance
    - plugins:
      - name: overcommit
      - name: drf
        enablePreemptable: false
      - name: predicates
      - name: proportion
      - name: nodeorder
      - name: nodegroup
      - name: binpack
        arguments:
          binpack.weight: 10
          binpack.cpu: 1
          binpack.memory: 1
          binpack.resources: nvidia.com/gpu
          binpack.resources.nvidia.com/gpu: 8
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"volcano-scheduler.conf":"actions: \"enqueue, allocate, backfill\"\ntiers:\n- plugins:\n  - name: priority\n  - name: gang\n    enablePreemptable: false\n  - name: conformance\n- plugins:\n  - name: overcommit\n  - name: drf\n    enablePreemptable: false\n  - name: predicates\n  - name: proportion\n  - name: nodeorder\n  - name: binpack\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"volcano-scheduler-configmap","namespace":"volcano-system"}}
  creationTimestamp: "2024-04-10T03:18:37Z"
  name: volcano-scheduler-configmap
  namespace: volcano-system
  resourceVersion: "120372887"
  uid: 122c0b0b-fea7-4aa7-873d-3017bcb7722c

@ffz12
Author

ffz12 commented Apr 25, 2024

The log level is set to 5. The relevant log for job-1 is as follows:

Added Queue attributes.
I0425 06:28:19.398670 1 proportion.go:158] Queue a800 allocated <cpu 0.00, memory 0.00> request <cpu 0.00, memory 0.00> inqueue <cpu 0.00, memory 0.00> elastic <cpu 0.00, memory 0.00>
I0425 06:28:19.398730 1 proportion.go:204] Considering Queue : weight <1>, total weight <1>.
I0425 06:28:19.398782 1 proportion.go:220] Format queue deserved resource to <cpu 0.00, memory 0.00>
I0425 06:28:19.398825 1 proportion.go:224] queue is meet
I0425 06:28:19.398866 1 proportion.go:230] The attributes of queue in proportion: deserved <cpu 0.00, memory 0.00>, realCapability <cpu 2703200.00, memory 41159590465010.00, nvidia.com/gpu 5000.00, nvidia.com/hostdev_2 0.00, pods 0.00, nvidia.com/hostdev_1 0.00, ephemeral-storage 0.00, hugepages-1Gi 0.00, hugepages-2Mi 0.00>, allocate <cpu 0.00, memory 0.00>, request <cpu 0.00, memory 0.00>, elastic <cpu 0.00, memory 0.00>, share <0.00>
I0425 06:28:19.398925 1 proportion.go:242] Remaining resource is <cpu 2703200.00, memory 41159590465010.00, nvidia.com/gpu 320000.00, nvidia.com/hostdev_2 40000.00, pods 4620.00, nvidia.com/hostdev_1 200000.00, ephemeral-storage 20403247824896000.00, hugepages-1Gi 0.00, hugepages-2Mi 0.00>
I0425 06:28:19.398992 1 proportion.go:244] Exiting when remaining is empty or no queue has more resource request: <cpu 2703200.00, memory 41159590465010.00, hugepages-2Mi 0.00, nvidia.com/gpu 320000.00, nvidia.com/hostdev_2 40000.00, pods 4620.00, nvidia.com/hostdev_1 200000.00, ephemeral-storage 20403247824896000.00, hugepages-1Gi 0.00>
I0425 06:28:19.399086 1 nodegroup.go:217] queueGroupAffinity queueGroupAntiAffinityRequired <map[]> queueGroupAntiAffinityPreferred <map[]> queueGroupAffinityRequired <map[a800:map[a800:{}] a8001:map[a8001:{}]]> queueGroupAffinityPreferred <map[]> groupLabelName <volcano.sh/nodegroup-name>
I0425 06:28:19.399131 1 binpack.go:165] Enter binpack plugin ...
I0425 06:28:19.399148 1 binpack.go:183] resources [] record in weight but not found on any node
I0425 06:28:19.399169 1 binpack.go:167] Leaving binpack plugin. binpack.weight[10], binpack.cpu[1], binpack.memory[1], nvidia.com/gpu[8], cpu[1], memory[1] ...
I0425 06:28:19.399187 1 enqueue.go:45] Enter Enqueue ...
I0425 06:28:19.399206 1 enqueue.go:63] Added Queue for Job <r1/job-1-47412291-c368-4e58-ae68-4fcb9158cbec>
I0425 06:28:19.399233 1 enqueue.go:74] Added Job <r1/job-1-47412291-c368-4e58-ae68-4fcb9158cbec> into Queue
I0425 06:28:19.399254 1 enqueue.go:79] Try to enqueue PodGroup to 1 Queues
I0425 06:28:19.399286 1 overcommit.go:123] Sufficient resources, permit job <r1/job-1-47412291-c368-4e58-ae68-4fcb9158cbec> to be inqueue
I0425 06:28:19.399339 1 proportion.go:336] job job-1-47412291-c368-4e58-ae68-4fcb9158cbec min resource <cpu 4000.00, memory 4294967296.00, nvidia.com/gpu 2000.00, pods 1.00>, queue a800 capability <cpu 2703200.00, memory 41159590465010.00, ephemeral-storage 0.00, hugepages-1Gi 0.00, hugepages-2Mi 0.00, nvidia.com/gpu 5000.00, nvidia.com/hostdev_2 0.00, pods 0.00, nvidia.com/hostdev_1 0.00> allocated <cpu 0.00, memory 0.00> inqueue <cpu 0.00, memory 0.00> elastic <cpu 0.00, memory 0.00>
I0425 06:28:19.399368 1 proportion.go:349] job job-1-47412291-c368-4e58-ae68-4fcb9158cbec inqueue false
I0425 06:28:19.399421 1 enqueue.go:104] Leaving Enqueue ...
I0425 06:28:19.399440 1 allocate.go:47] Enter Allocate ...
I0425 06:28:19.399454 1 allocate.go:74] Job <r1/job-1-47412291-c368-4e58-ae68-4fcb9158cbec> Queue skip allocate, reason: job status is pending.
I0425 06:28:19.399468 1 allocate.go:64] Try to allocate resource to 0 Queues

@lowang-bh
Member

proportion.go:336] job job-1-47412291-c368-4e58-ae68-4fcb9158cbec min resource <cpu 4000.00, memory 4294967296.00, nvidia.com/gpu 2000.00, pods 1.00>, queue a800 capability <cpu 2703200.00, memory 41159590465010.00, ephemeral-storage 0.00, hugepages-1Gi 0.00, hugepages-2Mi 0.00, nvidia.com/gpu 5000.00, nvidia.com/hostdev_2 0.00, pods 0.00, nvidia.com/hostdev_1 0.00> allocated <cpu 0.00, memory 0.00> inqueue <cpu 0.00, memory 0.00> elastic <cpu 0.00, memory 0.00>

Your queue's capability for pods is 0, so the job cannot be enqueued.

		klog.V(5).Infof("job %s min resource <%s>, queue %s capability <%s> allocated <%s> inqueue <%s> elastic <%s>",
			job.Name, minReq.String(), queue.Name, attr.realCapability.String(), attr.allocated.String(), attr.inqueue.String(), attr.elastic.String())
		// The queue resource quota limit has not reached
		r := minReq.Add(attr.allocated).Add(attr.inqueue).Sub(attr.elastic)
		rr := attr.realCapability.Clone()

		for name := range rr.ScalarResources {
			if _, ok := r.ScalarResources[name]; !ok {
				delete(rr.ScalarResources, name)
			}
		}

		inqueue := r.LessEqual(rr, api.Infinity)
		klog.V(5).Infof("job %s inqueue %v", job.Name, inqueue)

@ffz12
Author

ffz12 commented Apr 26, 2024

How do I do this?

@ffz12
Author

ffz12 commented Apr 29, 2024

This is how to write the queue so that the job runs normally:

cat a800.yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: a800
spec:
    reclaimable: true
    weight: 1
    capability:
       nvidia.com/gpu: "5"
       pods: 200    # the pods count must be specified
    affinity:            # added field
      nodeGroupAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - a800

@lowang-bh
Member

lowang-bh commented Apr 29, 2024

Because #3188 added pods as a kind of extended resource to support preemption.
@Monokaix I think we'd better compile a changelog declaring the changes that will affect end users when publishing a release note.

@Monokaix
Member

Can v1.8.2 solve your problem?

@Monokaix
Member

Because #3188 added pods as a kind of extended resource to support preemption. @Monokaix I think we'd better compile a changelog declaring the changes that will affect end users when publishing a release note.

I think this can be solved after #3216 is merged.

@Monokaix
Member

Monokaix commented May 9, 2024

/close

@volcano-sh-bot
Contributor

@Monokaix: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
