Support for the Job managedBy field (alpha) #123273

Merged
k8s-ci-robot merged 8 commits into kubernetes:master from the job-managed-by-impl branch on Mar 5, 2024

Conversation

@mimowo mimowo commented Feb 13, 2024

What type of PR is this?

/kind feature
/kind api-change

What this PR does / why we need it:

Adds a mechanism for delegating reconciliation of a Job to an external controller. See the KEP for more details.

Which issue(s) this PR fixes:

Part of enhancement tracking issue: kubernetes/enhancements#4368

Special notes for your reviewer:

After discussion with the API reviewers, we decided to use a field instead of a label, see here.

This means the feature will go into Alpha for 1.30. Here is the relevant KEP update.

Does this PR introduce a user-facing change?

Added (alpha) support for the managedBy field on Jobs. Jobs with a custom value of this field - any
value other than `kubernetes.io/job-controller` - are skipped by the job controller, and their
reconciliation is delegated to an external controller, indicated by the value of the field. Jobs that
don't have this field at all, or where the field value is the reserved string `kubernetes.io/job-controller`,
are reconciled by the built-in job controller.
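
A minimal sketch of the rule in the release note above, not the code from this PR. The helper `managedByExternalController` and the constant name `JobControllerName` are hypothetical names chosen for illustration; only the reserved string `kubernetes.io/job-controller` comes from the note. The field is modeled as a `*string` so that "not set at all" can be told apart from a set value.

```go
package main

import "fmt"

// JobControllerName is the reserved value from the release note.
// The constant name itself is an assumption made for this sketch.
const JobControllerName = "kubernetes.io/job-controller"

// managedByExternalController is a hypothetical helper. It returns the name of
// the external controller and true when the built-in job controller should
// skip the Job, and "" and false when the built-in controller reconciles it.
func managedByExternalController(managedBy *string) (string, bool) {
	// Field absent, or set to the reserved value: built-in controller.
	if managedBy == nil || *managedBy == JobControllerName {
		return "", false
	}
	// Any other value names the external controller.
	return *managedBy, true
}

func main() {
	reserved := JobControllerName
	custom := "example.com/custom-controller" // made-up value for illustration

	for _, managedBy := range []*string{nil, &reserved, &custom} {
		if name, external := managedByExternalController(managedBy); external {
			fmt.Printf("skipped by the job controller, delegated to %s\n", name)
		} else {
			fmt.Println("reconciled by the built-in job controller")
		}
	}
}
```

Only the third case, the custom value, is treated as delegated; the unset field and the reserved string both fall through to the built-in controller.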

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/4368-support-managed-by-for-batch-jobs

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/test labels Feb 13, 2024
@k8s-ci-robot k8s-ci-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. sig/apps Categorizes an issue or PR as relevant to SIG Apps. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/testing Categorizes an issue or PR as relevant to SIG Testing. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 13, 2024
@mimowo mimowo changed the title from "WIP; support for the Job managed-by label" to "WIP: support for the Job managed-by label" Feb 13, 2024
@mimowo
Contributor Author

mimowo commented Feb 13, 2024

/cc @alculquicondor

@sftim
Contributor

sftim commented Feb 13, 2024

Changelog suggestion

-Add support for the Job managed-by label in Beta
+Added more labelling for Pods that belong to a Job. The Pods are labelled with the key `batch.kubernetes.io/managed-by`
+provided that the feature gate `JobManagedByLabel` is enabled for the cluster. The label value is `job-controller.k8s.io`
+for Pods managed by a Job.

(if that's not right, please correct the text to tell the actual story)

test/e2e/apps/job.go Outdated Show resolved Hide resolved
@mimowo mimowo force-pushed the job-managed-by-impl branch 3 times, most recently from 9937612 to ef74f41 Compare February 14, 2024 10:00
@mimowo
Contributor Author

mimowo commented Feb 14, 2024

Changelog suggestion

-Add support for the Job managed-by label in Beta
+Added more labelling for Pods that belong to a Job. The Pods are labelled with the key `batch.kubernetes.io/managed-by`
+provided that the feature gate `JobManagedByLabel` is enabled for the cluster. The label value is `job-controller.k8s.io`
+for Pods managed by a Job.

(if that's not right, please correct the text to tell the actual story)

Thanks for pointing out that this needs clarifying. Let me know if the new note makes it clear; I'm happy to adjust or clarify further.

@mimowo mimowo force-pushed the job-managed-by-impl branch 2 times, most recently from 1b0ef21 to 9497308 Compare February 14, 2024 13:06
@sftim
Contributor

sftim commented Feb 14, 2024

Further changelog suggestion

-Add support for the Job's "batch.kubernetes.io/managed-by" label in Beta. Jobs with a custom value of this
-label (other than "job-controller.k8s.io") are skipped by the Job controller, and their reconciliation is
+Added (beta) support for the `batch.kubernetes.io/managed-by` label on Jobs. Jobs with a custom value of this
+label - any label value other than `job-controller.k8s.io` - are skipped by the job controller, and their reconciliation is
-delegated to an external controller, indicated by the value of the label. Jobs without this label, or with 
-the reserved "job-controller.k8s.io" as value, are reconciled by the built-in Job controller.
+delegated to an external controller, indicated by the value of the label. Jobs that don't have this label at all,
+or where the label value is the reserved string `job-controller.k8s.io` as value, are reconciled by the built-in
+job controller.

This is [a fragment of] Markdown, so use backticks rather than double quotes. Also no capitals in “job controller”.

@mimowo
Contributor Author

mimowo commented Feb 15, 2024

Further changelog suggestion

lgtm, thanks, applied

@sftim
Contributor

sftim commented Feb 15, 2024

D'oh, my error; one more suggested update.

 Added (beta) support for the `batch.kubernetes.io/managed-by` label on Jobs. Jobs with a custom value of this
 label - any label value other than `job-controller.k8s.io` - are skipped by the job controller, and their reconciliation is
 delegated to an external controller, indicated by the value of the label. Jobs that don't have this label at all,
-or where the label value is the reserved string `job-controller.k8s.io` as value, are reconciled by the built-in
+or where the label value is the reserved string `job-controller.k8s.io`, are reconciled by the built-in
 job controller.

@mimowo
Contributor Author

mimowo commented Feb 15, 2024

The failure in the TestNonParallelJob integration test looks like an unrelated flake, and I got this failure locally on master as well. I will investigate.

EDIT: I see what is going on, fix: #123321. I saw it failing locally because I was looping the test.

@mimowo
Contributor Author

mimowo commented Mar 4, 2024

/retest

Member

@alculquicondor alculquicondor left a comment

/approve cancel
to skip updating the pod metrics for finalized pods.

pkg/registry/batch/job/strategy.go Show resolved Hide resolved
@mimowo
Contributor Author

mimowo commented Mar 4, 2024

/approve cancel
to skip updating the pod metrics for finalized pods.

Same response as here: #123273 (comment). I'm afraid that making the distinction is likely to introduce bugs. For example, the metric is a gauge. We might skip the increment because the Job has a custom managedBy, but we would still decrement the metric on pod removal (because by then there is no longer a parent, so we cannot check). This could skew the metric, for example by reaching 0 too early.

@mimowo
Contributor Author

mimowo commented Mar 4, 2024

/retest

@alculquicondor
Member

/approve
/hold
for soltysh

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 4, 2024
@mimowo
Contributor Author

mimowo commented Mar 4, 2024

/retest

pkg/controller/job/job_controller.go Outdated Show resolved Hide resolved
pkg/controller/job/job_controller.go Outdated Show resolved Hide resolved
Contributor

@soltysh soltysh left a comment

/lgtm
/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Mar 5, 2024
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 5, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: a1e9e235b583a9b058118835617e84fd5b6d06d4

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, deads2k, mimowo, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@soltysh
Contributor

soltysh commented Mar 5, 2024

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 5, 2024
@mimowo
Contributor Author

mimowo commented Mar 5, 2024

/test pull-kubernetes-e2e-kind-ipv6
unrelated failure

@k8s-ci-robot k8s-ci-robot merged commit e568a77 into kubernetes:master Mar 5, 2024
16 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.30 milestone Mar 5, 2024
Jeffwan pushed a commit to Jeffwan/kubernetes that referenced this pull request Mar 6, 2024
* support for the managed-by label in Job

* Use managedBy field instead of managed-by label

* Additional review remarks

* Review remarks 2

* review remarks 3

* Skip cleanup of finalizers for job with custom managedBy

* Drop the performance optimization

* imrpove logs
dinhxuanvu pushed a commit to dinhxuanvu/kubernetes that referenced this pull request Mar 28, 2024
Labels
* `api-review`: Categorizes an issue or PR as actively needing an API review.
* `approved`: Indicates a PR has been approved by an approver from all required OWNERS files.
* `area/code-generation`
* `area/e2e-test-framework`: Issues or PRs related to refactoring the kubernetes e2e test framework
* `area/test`
* `cncf-cla: yes`: Indicates the PR's author has signed the CNCF CLA.
* `kind/api-change`: Categorizes issue or PR as related to adding, removing, or otherwise changing an API
* `lgtm`: "Looks good to me", indicates that a PR is ready to be merged.
* `priority/important-longterm`: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
* `release-note`: Denotes a PR that will be considered when it comes time to generate release notes.
* `sig/api-machinery`: Categorizes an issue or PR as relevant to SIG API Machinery.
* `sig/apps`: Categorizes an issue or PR as relevant to SIG Apps.
* `sig/testing`: Categorizes an issue or PR as relevant to SIG Testing.
* `size/XXL`: Denotes a PR that changes 1000+ lines, ignoring generated files.
* `tide/merge-method-squash`: Denotes a PR that should be squashed by tide when it merges.
* `triage/accepted`: Indicates an issue or PR is ready to be actively worked on.
Projects
Status: API review completed, 1.30
Archived in project