Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RequeueState isn't always reset #1821

Closed
alculquicondor opened this issue Mar 11, 2024 · 6 comments · Fixed by #1838
Closed

RequeueState isn't always reset #1821

alculquicondor opened this issue Mar 11, 2024 · 6 comments · Fixed by #1838
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@alculquicondor
Copy link
Contributor

What happened:

If you introduce g.Expect(workload.Status.RequeueState).NotTo(gomega.BeNil()) right after this line:

g.Expect(k8sClient.Get(ctx, client.ObjectKeyFromObject(workload), workload)).Should(gomega.Succeed())

You will see that the object already lost the RequeueState.
This is because the workload reconciler sometimes sees the old object and produces an SSA update that still has the original object with no requeueState.

I think the webhook defaulter only applies on Create and not on Update.

We should probably reset the requeueState in the reconciler itself, but I'm not sure at which point. Unless the webhook can be configured to be called during Update?

What you expected to happen:

RequeueState to be reset

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

I discovered this in #1820, when I tried to make the reconciler only do updates when the Admission condition changes.

Environment:

  • Kubernetes version (use kubectl version):
  • Kueue version (use git describe --tags --dirty --always):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
@alculquicondor alculquicondor added the kind/bug Categorizes issue or PR as related to a bug. label Mar 11, 2024
@alculquicondor
Copy link
Contributor Author

/assign @tenzen-y

@tenzen-y
Copy link
Member

Unless the webhook can be configured to be called during Update?

@alculquicondor We have already configure like this:

- admissionReviewVersions:
- v1
clientConfig:
service:
name: webhook-service
namespace: system
path: /mutate-kueue-x-k8s-io-v1beta1-workload
failurePolicy: Fail
name: mworkload.kb.io
rules:
- apiGroups:
- kueue.x-k8s.io
apiVersions:
- v1beta1
operations:
- CREATE
- UPDATE
resources:
- workloads
- workloads/status
sideEffects: None

@tenzen-y
Copy link
Member

So, maybe it seems not to work 🧐

@alculquicondor
Copy link
Contributor Author

It looks like the defaulting does work during Updates. However, webhooks cannot modify status on spec updates, or vice-versa.

So in the test API call that updates the Active field, the status.requeueState doesn't change.

But the next reconcile runs into this Status Update call:

err := workload.ApplyAdmissionStatus(ctx, r.client, &wl, true)

And at that point the webhook clears the requeueState.

So I think the correct action is to let the reconciler do the change, but on its dedicated API call.

@tenzen-y
Copy link
Member

It looks like the defaulting does work during Updates. However, webhooks cannot modify status on spec updates, or vice-versa.

I agree with this. Adding a test would be better.
Maybe, we can have a webhook test without launching controllers, but I'm not sure if that test is worth it.

@tenzen-y
Copy link
Member

So I think the correct action is to let the reconciler do the change, but on its dedicated API call.

That sounds reasonable. Let me try it.
Thanks for sharing the results of the investigations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants