PG minResources calculation is unreasonable: the minReq of the minAvailable tasks only considers the order in the YAML when the tasks have the same priority #3319

Open
henrenzhenjumin1111hao opened this issue Feb 6, 2024 · 1 comment
Labels
kind/bug Categorizes issue or PR as related to a bug.

@henrenzhenjumin1111hao

What happened:
If the cluster has only 2 GPUs, the PodGroup stays Pending and no pods can be created. The YAML looks like this:

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: testVcjob
spec:
  minAvailable: 2
  schedulerName: volcano
  policies:
    - event: PodEvicted
      action: RestartJob
  queue: default
  tasks:
    - replicas: 1
      name: worker1
      template:
        spec:
          containers:
            - image: ubuntu:latest
              command: ["sh", "-c", "sleep 1000000"]
              name: worker1
              resources:
                limits:
                  nvidia.com/gpu: "2"
                requests:
                  nvidia.com/gpu: "2"
          restartPolicy: Never
    - replicas: 2
      name: worker2
      template:
        spec:
          containers:
            - image: ubuntu:latest
              command: ["sh", "-c", "sleep 1000000"]
              name: worker2
              resources:
                limits:
                  nvidia.com/gpu: "1"
                requests:
                  nvidia.com/gpu: "1"
          restartPolicy: Never

What you expected to happen:
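The PG minResources should be 2 GPUs, not 3. The current calculation counts the single worker1 pod (2 GPUs) plus one worker2 pod (1 GPU) to reach minAvailable: 2, which gives 3 GPUs. The cluster has 2 GPUs, which is enough to schedule both worker2 pods (1 GPU each), so minAvailable could be met and the vcjob could run.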

How to reproduce it (as minimally and precisely as possible):
Apply the YAML above on a cluster with 2 GPUs to reproduce. The logic in the code (job_controller_action.go) only walks tasks of the same priority in their YAML order:

minReq := v1.ResourceList{}
podCnt := int32(0)
for _, task := range tasksPriority {
	for i := int32(0); i < task.Replicas; i++ {
		if podCnt >= job.Spec.MinAvailable {
			break
		}

		podCnt++
		pod := &v1.Pod{
			Spec: task.Template.Spec,
		}
		minReq = quotav1.Add(minReq, *util.GetPodQuotaUsage(pod))
	}
}

Anything else we need to know?:
no
Environment:

  • Volcano Version: 1.8.1
  • Kubernetes version (use kubectl version): 1.28.2
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
henrenzhenjumin1111hao added the kind/bug label Feb 6, 2024
@lowang-bh
Member

You can set a higher priority for the second worker.
