with preempt or reclaim plugin, the high priority pod can not be placed at some node which meet the conditions for preemption #3329
Comments
Could you please provide more information, such as the scheduler ConfigMap, the scheduler logs, and the job config?
During preempt or reclaim, if one predicate function handler returns a status with
What is the error?
Here is a scenario: an unscheduled pod with GPU resources is in the session of
volcano/pkg/scheduler/plugins/predicates/predicates.go
Lines 530 to 554 in 6e9f4f6
This is indeed a problem in vgpu preemption. I think we should not return err here when the vgpu resource is insufficient. If you're interested, you're welcome to fix that.
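One way to read this suggestion is sketched below. It is only an illustration under stated assumptions, not Volcano's actual code: `Code`, `Status`, `checkVGPU`, and `preemptable` are hypothetical names, loosely modeled on the distinction the Kubernetes scheduling framework draws between `Unschedulable` and `UnschedulableAndUnresolvable`. The idea is that a vgpu shortage is reported as a status that preemption may still resolve, instead of as an error that removes the node from consideration.

```go
package main

import "fmt"

// Code is a hypothetical result code for a predicate check.
type Code int

const (
	Success                      Code = iota
	Unschedulable                     // may become schedulable after evicting pods
	UnschedulableAndUnresolvable      // evicting pods cannot help
)

// Status is a hypothetical predicate result: a code plus a human-readable reason.
type Status struct {
	Code   Code
	Reason string
}

// checkVGPU is a hypothetical device check. It returns a Status rather than
// an error, so callers can tell "short on vgpu right now" apart from
// "this node can never fit the pod".
func checkVGPU(nodeHasVGPU bool, free, requested int) Status {
	switch {
	case requested == 0:
		return Status{Code: Success}
	case !nodeHasVGPU:
		return Status{Code: UnschedulableAndUnresolvable, Reason: "node has no vgpu devices"}
	case free < requested:
		return Status{Code: Unschedulable, Reason: "not enough gpu fitted on this node"}
	default:
		return Status{Code: Success}
	}
}

// preemptable reports whether a node should stay a candidate for preemption.
func preemptable(s Status) bool {
	return s.Code == Success || s.Code == Unschedulable
}

func main() {
	s := checkVGPU(true, 0, 1)
	fmt.Println(s.Reason, "| still a preemption candidate:", preemptable(s))
}
```

With this shape, the preempt and reclaim actions could keep a node whose only failure is `Unschedulable` and go on to look for victims there.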
Same problem: #3186. We can fix it to resolve both of them.
@LivingCcj @lowang-bh You're welcome to fix this :)
This phenomenon recurs when the vgpu resource is insufficient.
Vital information: device_info.go:187] deviceSharing err= not enough gpu fitted on this node
@archlitchi owns and is familiar with the vgpu code. @Monokaix
I might be experiencing a similar issue.
When the Volcano scheduler enables the preempt or reclaim plugin, a high-priority pod is unable to preempt a low-priority pod. Although some nodes meet the preemption conditions, if any one of the predicateFns returns a non-nil err, the potential node is ignored.
volcano/pkg/scheduler/actions/preempt/preempt.go
Lines 211 to 221 in 94c62a4
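The behavior described above can be sketched in a few lines. This is a simplified illustration, not the code at the referenced lines: `Node`, `PredicateFn`, `ErrInsufficient`, `filterStrict`, and `filterForPreemption` are all hypothetical names. It contrasts a filter that drops a node on any predicate error with one that keeps nodes whose only failure is a resource shortage that evicting lower-priority pods could resolve.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrInsufficient is a hypothetical sentinel meaning "not enough resources right now".
var ErrInsufficient = errors.New("insufficient resources")

// Node is a hypothetical, minimal stand-in for a scheduler node.
type Node struct {
	Name    string
	FreeGPU int
	Tainted bool
}

// PredicateFn is a hypothetical predicate signature: nil means the node fits.
type PredicateFn func(n Node, gpuRequest int) error

func gpuPredicate(n Node, gpuRequest int) error {
	if n.FreeGPU < gpuRequest {
		return fmt.Errorf("node %s: %w", n.Name, ErrInsufficient)
	}
	return nil
}

func taintPredicate(n Node, _ int) error {
	if n.Tainted {
		return fmt.Errorf("node %s: untolerated taint", n.Name)
	}
	return nil
}

// filter keeps the nodes for which no predicate returns a fatal error.
func filter(nodes []Node, req int, preds []PredicateFn, fatal func(error) bool) []string {
	var out []string
	for _, n := range nodes {
		ok := true
		for _, p := range preds {
			if err := p(n, req); err != nil && fatal(err) {
				ok = false
				break
			}
		}
		if ok {
			out = append(out, n.Name)
		}
	}
	return out
}

// filterStrict treats every predicate error as fatal (the reported behavior).
func filterStrict(nodes []Node, req int, preds []PredicateFn) []string {
	return filter(nodes, req, preds, func(error) bool { return true })
}

// filterForPreemption tolerates ErrInsufficient, since eviction may free resources.
func filterForPreemption(nodes []Node, req int, preds []PredicateFn) []string {
	return filter(nodes, req, preds, func(err error) bool {
		return !errors.Is(err, ErrInsufficient)
	})
}

func main() {
	nodes := []Node{
		{Name: "node-a", FreeGPU: 0},
		{Name: "node-b", FreeGPU: 0, Tainted: true},
	}
	preds := []PredicateFn{gpuPredicate, taintPredicate}
	fmt.Println(filterStrict(nodes, 1, preds))        // prints []
	fmt.Println(filterForPreemption(nodes, 1, preds)) // prints [node-a]
}
```

In this sketch, `node-a` is full but otherwise suitable, so it remains a preemption candidate, while `node-b` is still rejected because evicting pods would not remove its taint.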
Environment:
kubectl version: v1.20.15