[GEP-20] Update the existing recovery mechanisms in the proposal #6732
d3ce0a3 to 7424e05
/assign @timuthy @unmarshall
Many thanks for the improvements. Just two suggestions from my side.
07d8521 to 9c6b24f
@timuthy Can we merge this?
Two more points I noticed after the changes.
b2a0538 to b699a3f
@ialidzhikov: The following test failed, say
d00cebe to 80c42e2
Thanks a lot for all the efforts and clarification about this topic @ialidzhikov and @himanshu-kun ❤️
/lgtm
80c42e2 to 23a8571
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: timuthy
How to categorize this PR?
/area high-availability
/kind enhancement
What this PR does / why we need it:
Previously, there was a misunderstanding that in case of a Node/zone outage, Pods on unhealthy Nodes hang forever in the Terminating state and that we have to enhance/implement garbage collection logic for this case. This PR updates the existing recovery mechanisms and describes the observations from #6529 (comment) and #6529 (comment).
Which issue(s) this PR fixes:
Part of #6529
Release note: