Add readiness & liveness probes to kube-proxy #75323
Conversation
Hi @stafot. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/sig aws
/ok-to-test
A previous attempt was #50118. Giving it another thought now, it seems totally reasonable to have liveness/readiness probes on kube-proxy. It would be great to add this change to the below as well, for consistency:
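For context, a probe configuration along the lines proposed here might look like the following sketch for the kube-proxy DaemonSet pod spec. Port 10256 is kube-proxy's default healthz port (`--healthz-bind-address`); the exact threshold values are illustrative, not the values from this PR.

```yaml
# Hypothetical sketch of HTTP probes against kube-proxy's healthz endpoint.
# 10256 is the default --healthz-bind-address port; thresholds are examples.
livenessProbe:
  httpGet:
    path: /healthz
    port: 10256
  initialDelaySeconds: 15
  timeoutSeconds: 15
  failureThreshold: 2
readinessProbe:
  httpGet:
    path: /healthz
    port: 10256
  timeoutSeconds: 15
  failureThreshold: 2
```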
@kubernetes/sig-cluster-lifecycle-pr-reviews @kubernetes/sig-network-pr-reviews
One problem though: in case the kube-apiserver is not available (e.g. during a master upgrade), kube-proxy may become unhealthy and get restarted, even though that won't help.
Thanks for the information, I'll update accordingly. Regarding master upgrades, can this failure cause any harm? I believe not, but if it can and it's a blocking issue, let me know.
@stafot
please add a release note that explains the change in one sentence instead of NONE.
also we are in code freeze until "code thaw" in 1.14 https://github.com/kubernetes/sig-release/tree/master/releases/release-1.14
/priority backlog
/assign @timothysc
@kubernetes/sig-cluster-lifecycle-pr-reviews
do we need a KEP?
The existing liveness and readiness probes for kube-proxy are in need of adjustment. The current implementation is exec-based, which can be a resource concern, and is tied heavily to iptables, so it is incompatible with ipvs. This change removes the exec-based liveness and readiness probes from the kube-proxy daemonset and replaces them with HTTP probes of the healthz endpoint, following the direction that kubernetes seems to be taking.[0][1] The values.yaml interface to enable and disable the probes and set various parameters is also modified to use the helm-toolkit standard snippet.[2] Notably, the settings previously configurable under livenessProbe.config are now under pod.probes.proxy.proxy.liveness.params.

0: kubernetes/kubernetes#81630
1: kubernetes/kubernetes#75323
2: https://opendev.org/openstack/openstack-helm-infra/src/branch/master/helm-toolkit/templates/snippets/_kubernetes_probes.tpl

Change-Id: I99ccbc2270a1f8a204417aa410868d04788dc60f
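Based on the key path named in the commit message above, the new values.yaml interface might be shaped roughly as follows. Only the pod.probes.proxy.proxy.liveness.params path is stated in the message; the readiness branch and the parameter names inside params are assumptions modeled on standard Kubernetes probe fields.

```yaml
# Hypothetical values.yaml fragment; exact keys come from the helm-toolkit
# probe snippet and may differ from this sketch.
pod:
  probes:
    proxy:
      proxy:
        liveness:
          enabled: true
          params:
            initialDelaySeconds: 15
            periodSeconds: 60
            timeoutSeconds: 15
        readiness:
          enabled: true
          params:
            initialDelaySeconds: 15
            periodSeconds: 60
            timeoutSeconds: 15
```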
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle rotten
Do you think this is something that we should continue to pursue @MrHohn ?
Sorry for the delay. As @danwinship mentioned, this probably needs a KEP to go forward due to the complication.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community.
Hi all, this PR is very old - what do we want to do with it? It seems reasonable to have a healthz handler and to use it. The handler for kube-proxy has undergone some changes, and as I look at it now it seems reasonable, or perhaps even too weak. I don't think it will trigger if the apiserver is down (though maybe it should). I also don't think it will trigger if there's a chronic failure to sync rules (e.g. an iptables error). Do we want to revive this?
timeoutSeconds: 15
successThreshold: 1
failureThreshold: 2
readinessProbe:
is there any need for readinessProbe matching livenessProbe exactly?
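One way to answer the reviewer's question is to make the two probes deliberately different: a failing readiness probe only marks the Pod NotReady, while a failing liveness probe restarts the container, so readiness can afford to be more sensitive. The following is a hypothetical sketch of that split; the specific periods and thresholds are assumptions, not values from this PR.

```yaml
# Hypothetical alternative: same healthz endpoint, but a more sensitive
# readiness probe and a more forgiving liveness probe, since only liveness
# failures trigger a container restart.
livenessProbe:
  httpGet:
    path: /healthz
    port: 10256
  periodSeconds: 60
  failureThreshold: 5
readinessProbe:
  httpGet:
    path: /healthz
    port: 10256
  periodSeconds: 10
  failureThreshold: 1
```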
It's not clear to me what problem this is solving; at least, that's not my understanding from the bug referenced in the description.
as mentioned by @MrHohn here: if that's no longer the case, i'd defer to him on whether we want to proceed with merging this PR. we should note that the original issue was closed without sufficient information about why exactly we want to apply probes to kube-proxy: thus far i have not seen other requests about this. unless someone objects, i propose that we close this PR ~ mid-next week.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: MrHohn, ohbus, stafot, timothysc. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
closing until a KEP is written for this change.
/close
@neolit123: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
So is it safe to add a readiness probe only for now? A readiness probe does not trigger a Pod restart, but it is a good indicator of Pod health and can help identify potential kube-proxy problems faster.
@SataQiu not sure. readiness should be fine, but i think we should keep the kubeadm / kube-up kube-proxy addons in sync for better test coverage.
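The readiness-only variant discussed above could be sketched as follows: with no livenessProbe at all, an unhealthy kube-proxy is surfaced in the Pod's Ready condition but never restarted, which sidesteps the apiserver-outage restart concern raised earlier in the thread. The threshold values here are illustrative assumptions.

```yaml
# Hypothetical readiness-only configuration: no livenessProbe, so an
# unhealthy kube-proxy is reported as NotReady but is never restarted.
readinessProbe:
  httpGet:
    path: /healthz
    port: 10256
  periodSeconds: 10
  timeoutSeconds: 15
  failureThreshold: 2
```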
Possible mitigation of #75189
What type of PR is this?
/kind bug
What this PR does / why we need it:
Add readiness & liveness probes to kube-proxy daemonset example.
Which issue(s) this PR fixes:
Fixes #75189
Special notes for your reviewer:
Does this PR introduce a user-facing change?: