New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Propagate the reason for Eviction into the pod TerminationTarget condition #2160
Conversation
Hi @pajakd. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
|
/ok-to-test |
StopReasonNoMatchingWorkload | ||
StopReasonNotAdmitted | ||
StopReasonWorkloadDeleted StopReason = "WorkloadDeleted" | ||
StopReasonWorkloadEvicted StopReason = "WorkloadEvicted" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uhm... this one might not be enough. We also need to know if it was WaitForPodsReady or Preemption
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, a second try. Please take a look. I don't know if I should add to this enum all possible Eviction reasons or I can just directly cast and propagate the evCond.Reason
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, let's see if we can use a hyphen to concatenate the reasons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alculquicondor Could we decide the delimiter based on this discussion https://github.com/kubernetes/enhancements/pull/4479/files#r1581226166 since the reason should be CamelCase?
In condition types, and everywhere else they appear in the API, Reason is intended to be a one-word, CamelCase representation of the category of cause of the current status
That discussion will seem to happen at the next sig apps meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That discussion will seem to happen at the next sig apps meeting.
Oh, there is not the discussion in the agenda...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just trying to make the reason make sense.
WorkloadEvictedPreempted
doesn't read that well.
So, the alternatives would be:
WorkloadPreempted
(drop theEvicted
part)Preempted
(just use the reason directly, and forget about the Workload bit)WorkloadEvictedBecausePreempted
(same proposal as the hyphen, but with a word)
Maybe 1 reads best of all? WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you since we have already the Evicted
in the condition.type
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the possible reasons for eviction are:
kueue/apis/kueue/v1beta1/workload_types.go
Lines 332 to 351 in b870144
const ( | |
// WorkloadEvictedByPreemption indicates that the workload was evicted | |
// in order to free resources for a workload with a higher priority. | |
WorkloadEvictedByPreemption = "Preempted" | |
// WorkloadEvictedByPodsReadyTimeout indicates that the eviction took | |
// place due to a PodsReady timeout. | |
WorkloadEvictedByPodsReadyTimeout = "PodsReadyTimeout" | |
// WorkloadEvictedByAdmissionCheck indicates that the workload was evicted | |
// because at least one admission check transitioned to False. | |
WorkloadEvictedByAdmissionCheck = "AdmissionCheck" | |
// WorkloadEvictedByClusterQueueStopped indicates that the workload was evicted | |
// because the ClusterQueue is Stopped. | |
WorkloadEvictedByClusterQueueStopped = "ClusterQueueStopped" | |
// WorkloadEvictedByDeactivation indicates that the workload was evicted | |
// because spec.active is set to false. | |
WorkloadEvictedByDeactivation = "InactiveWorkload" |
And 1. reads definitely good for "Preempted" but for the other ones it would be a bit confusing. In my opinion concatenation with the dash would make the most sense. But if it is not recommended in the guidelines then we can decide on any of the alternatives. Would WorkloadEvictedReason{Preempted,PodsReadyTimeout ...}
(a variant of 3) be too long?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it, even if it's too long
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, changed the joining word to "Reason".
Co-authored-by: Michał Woźniak <mimowo@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
LGTM label has been added. Git tree hash: d68c24335a60cf1b5fbb031ad04c1635bb7a6b32
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alculquicondor, pajakd The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Whenever kueue is stopping a pod, it adds a condition TerminationTarget. However, the reason for termination is always "StoppedByKueue". The goal of this PR is to set a specific reason depending on why the pod is stopped (like workload is evicted or workload is deleted).
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?