Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-4368: Job API managed-by label #4370

Merged
merged 51 commits into from Feb 5, 2024

Conversation

mimowo
Copy link
Contributor

@mimowo mimowo commented Dec 20, 2023

  • One-line PR description: Proposes Job API label for Multi-Kueue
  • Other comments:

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 20, 2023
@k8s-ci-robot k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Dec 20, 2023
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. sig/apps Categorizes an issue or PR as relevant to SIG Apps. labels Dec 20, 2023
@mimowo mimowo mentioned this pull request Dec 20, 2023
8 tasks
@mimowo mimowo force-pushed the job-managed-by-label branch 4 times, most recently from 8ed11fd to 93eb6dc Compare December 20, 2023 18:15

Below are some examples to consider, in addition to the aforementioned [maturity levels][maturity-levels].

#### Alpha

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason or some background as to why this is going straight to beta?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The base-functionality implementation is very simple, just short-circuit the syncJob invocation. Also, I think going via Alpha is a way to allow for version skew between kube-apiservers in HA setup. To ensure that all apiserver instances recognize the new field. However, since this is a label, there is no such need technically.

Finally, we would like to have the feature soak without unnecessary delay.

I have added a note along the lines. PTAL.

@mimowo mimowo force-pushed the job-managed-by-label branch 2 times, most recently from ab35059 to c24755c Compare December 20, 2023 18:37

In the [Beta](#beta) iteration of the feature we skip synchronization of the
Jobs with the `managed-by` label, if it has any different value than `job-controller`.
We also default the label for all newly created jobs, to avoid implicit defaults.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about requiring the label on all jobs. I was expecting the normal default controller to keep working as-is for jobs without the label, and to switch to excluding jobs with the label, without needing to decorate or modify any existing jobs. That would mean that only exceptional jobs that wanted to be managed by a non-default controller would opt into the label.

We also default the label for all newly created jobs

Does this mean 1) we'd set a label on a job created by a user, or 2) our controllers that create jobs (e.g. cronjob controller) would populate a label? If 1, is there precedence for that? What component would default that and how would it know what job controller name the "default" controller was running as?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about requiring the label on all jobs. I was expecting the normal default controller to keep working as-is for jobs without the label, and to switch to excluding jobs with the label, without needing to decorate or modify any existing jobs. That would mean that only exceptional jobs that wanted to be managed by a non-default controller would opt into the label.

Makes sense to me. This requirement complicates defaulting and would prolong graduation. Let's consider how it can be avoided.

One question is if the built-in controller should handle jobs with managed-by: job-controller? I assume yes, by analogy to prior work on managed-by label.

So, if we do with simple ifs to skip synchronization, then it is simple. If we want to use label selectors for jobs, then we can probably do it with two informers, one for no label, and one for job-controller value.

I will adjust the KEP assuming we don't require it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to change the fact that we're using shared-informers. If that is true, then simple "if skip" is the only path forward.

...
// LabelManagedBy is used to indicate the controller or entity that manages
// an Job.
LabelManagedBy = "batch.kubernetes.io/managed-by"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the simplicity of the surface area here. Some benefits of the label approach over a field:

  • it generalizes to custom resources as well
  • it could consistently live in the same place in multiple objects (all objects have metadata.labels)

The main downside I see is that there's no signal in openapi whether the cluster is new enough to be aware of the controller subdivision. That's largely a problem time would cure.

I wonder if it would make sense to teach all the core workload controllers a generic version of this (kubernetes.io/managed-by) on their parent objects. That would enable use of multiple controllers on the same type, with the default (no managed-by label) meaning "use the default controller for this type".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

The main downside I see is that there's no signal in openapi whether the cluster is new enough to be aware of the controller subdivision. That's largely a problem time would cure.

Indeed, I added a note on the related risk in the Two controllers running at the same time on old version section

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would make sense to teach all the core workload controllers a generic version of this (kubernetes.io/managed-by) on their parent objects. That would enable use of multiple controllers on the same type, with the default (no managed-by label) meaning "use the default controller for this type".

@atiratree this ^ would probably be a good goal for sig-apps long term to address.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is an interesting idea. I suppose we could consider it, if there were clear benefits and a clearly defined process and contract for the ownership.

The downsides mentioned in this KEP are still quite significant to support this in all controllers. I think CRDs are still preferable for extending workloads like Deployments, StatefulSets, etc.

I know we have conformance tests, but that doesn't stop people from forking the controllers and implementing incompatible behavior. And API validation has its limits. This would then be problematic for the consumers of these workloads who would either be unprepared and unable to react in a safe and bug-free manner. And if they could, they would need extra logic and switch the behavior according to the current controller. They would also run into a similar problem if they found a controller they do not support.

As we see for example with StatefulSets and their need for migration. I expect the same need to arise for these controllers. E.g. I want to migrate from a controller A to a controller B for a new shiny functionality. So sooner or later we might make the managed label/field mutable. This would probably require some kind of locking, so that two controller would not operate on the same object. Even in mid change. And to consider rogue controllers.

The benefit for Jobs is that it is used as a building block in many projects, and we do not expect people to fork the controller. Not sure if this applies elsewhere.

@mimowo can we add the following to the GA graduation criteria?

Asses the fragmentation of the ecosystem. Look for other implementations of a job controller and asses their conformance with k8s.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, added the point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the generic kubernetes.io/managed-by label. I would like to defer it for this KEP.

There is currently no clear use case to support the label across the stack for APIs like StatefulSets, Deployments, or DaemonSets.

A generic name without support across all k8s APIs might be confusing to the users, and supporting it for all k8s APIs would be much bigger effort than
currently needed for the MultiKueue scenario use.

The "managed-by" label idea has significant risks, such as "ecosystem fragmentation due to forks".
It makes sense to start with limited scope as a "pilot" and assess the impact.

I put the proposal, along with the reasons to defer in the Alternatives section.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the generic kubernetes.io/managed-by label. I would like to defer it for this KEP.

Generic is definitely outside the scope of this KEP.

Comment on lines 382 to 383
In the [Beta](#beta) iteration of the feature we skip synchronization of the
Jobs with the `managed-by` label, if it has any different value than `job-controller`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question is what happens if the label is mutated on an existing job? From the point of view of a controller filtering on a label selector, that will result in a "delete" event if the label mutates to no longer match its selector.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that all our controllers are using SharedInformers anyway, I don't think that setting a selector is even possible. So the only thing we can effectively handle differently events where the label is set and that means that we will effectively be handling update in those cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure how label selectors would work so deferred as a potential thing to reconsider for second Beta.

However, since we use SharedInformers we would keep all job objects in memory anyway. This greatly undermines the complicated idea of label selectors. I will update the KEP to just stay with the skip if within the syncJob function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I give up on the idea of using label selectors, added them to the alternatives section, along with the reason why. The main point is that we cannot really use them to make memory gains due to used shared-informers. They are also much more complex conceptually.

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Dec 21, 2023
@mimowo mimowo force-pushed the job-managed-by-label branch 3 times, most recently from 0f68654 to b0477b3 Compare December 21, 2023 12:09
@wojtek-t wojtek-t self-assigned this Dec 21, 2023
Comment on lines 382 to 383
In the [Beta](#beta) iteration of the feature we skip synchronization of the
Jobs with the `managed-by` label, if it has any different value than `job-controller`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that all our controllers are using SharedInformers anyway, I don't think that setting a selector is even possible. So the only thing we can effectively handle differently events where the label is set and that means that we will effectively be handling update in those cases.

@mimowo mimowo force-pushed the job-managed-by-label branch 2 times, most recently from d2a6141 to 495866f Compare December 21, 2023 15:34
@soltysh
Copy link
Contributor

soltysh commented Jan 26, 2024

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jan 26, 2024
Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor nits, but overall
/lgtm
/approve

In particular, the CronJob relies on the assumption that the `CompletionTime`
is only set to successful jobs. Also, that a job does not flip its state from
Complete to Failed (or the other way round). Also, a finished job should not
flip back to non-finished.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strong 👍, if we're planning to allow external controllers update the status, we need to ensure that job won't start flapping between states.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 26, 2024
…DME.md

Co-authored-by: Maciej Szulik <soltysh@gmail.com>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 26, 2024
@mimowo
Copy link
Contributor Author

mimowo commented Jan 26, 2024

/assign @wojtek-t
For PRR review

@soltysh
Copy link
Contributor

soltysh commented Jan 26, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 26, 2024
Copy link
Member

@wojtek-t wojtek-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two very small comments - other than that LGTM.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 1, 2024
@wojtek-t
Copy link
Member

wojtek-t commented Feb 1, 2024

/approve PRR

[will lgtm after fixing the nit]

@mimowo
Copy link
Contributor Author

mimowo commented Feb 1, 2024

[will lgtm after fixing the nit]

done, also updated the checkpoints, PTAL (09321d0)

@wojtek-t
Copy link
Member

wojtek-t commented Feb 1, 2024

/lgtm

@soltysh - for SIG-level approval [you only lgtm-ed it]

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 1, 2024
@soltysh
Copy link
Contributor

soltysh commented Feb 5, 2024

@soltysh - for SIG-level approval [you only lgtm-ed it]

That's weird, I did that in #4370 (review) 🤷‍♂️ oh well, let me do that again.

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mimowo, soltysh, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 5, 2024
@k8s-ci-robot k8s-ci-robot merged commit 95b875b into kubernetes:master Feb 5, 2024
4 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.30 milestone Feb 5, 2024
@mimowo mimowo deleted the job-managed-by-label branch February 10, 2024 11:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/apps Categorizes an issue or PR as relevant to SIG Apps. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet