Kubelet synchronizes pod states in a parallel manner. #124596
Conversation
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected; please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Hi @zhifei92. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: zhifei92. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/assign @dims @rphillips
Please help me review this. Thanks!
@@ -436,6 +459,19 @@ func TestStaleUpdates(t *testing.T) {
 	verifyUpdates(t, m, 0)
 }
 
+func TestManyPodsUpdates(t *testing.T) {
+	const podNums = 100
@zhifei92 FWIW, in my environment (Ubuntu 22.04 on an Intel(R) Core(TM) i9-7920X CPU @ 2.90GHz, 24 cores) the change slows down the test:
master:
kubernetes (master) $ make test WHAT=./pkg/kubelet/status GOFLAGS="-v" KUBE_TEST_ARGS='-count 1 -run ^TestManyPodsUpdates$'
+++ [0508 14:06:19] Set GOMAXPROCS automatically to 24
+++ [0508 14:06:19] Running tests without code coverage and with -race
=== RUN TestManyPodsUpdates
--- PASS: TestManyPodsUpdates (0.05s)
PASS
ok k8s.io/kubernetes/pkg/kubelet/status 1.113s
this PR:
kubernetes (124596) $ make test WHAT=./pkg/kubelet/status GOFLAGS="-v" KUBE_TEST_ARGS='-count 1 -run ^TestManyPodsUpdates$'
+++ [0508 14:03:41] Set GOMAXPROCS automatically to 24
+++ [0508 14:03:42] Running tests without code coverage and with -race
=== RUN TestManyPodsUpdates
--- PASS: TestManyPodsUpdates (0.13s)
PASS
ok k8s.io/kubernetes/pkg/kubelet/status 1.231s
"Yes, that's correct, especially if we're not simulating any latency scenarios, where the extra overhead of locks does become a factor. However, considering there is inherent network latency in the communication between the kubelet and the kube-apiserver, we need to emulate a delay of 10 to 20 milliseconds within the syncPod
function to truly demonstrate the value of parallel processing."
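To make that point concrete, here is a self-contained Go sketch of the comparison being described; fakeSyncPod, podCount, and the 20 ms figure are illustrative stand-ins for this example, not code from the PR:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fakeSyncPod stands in for the status manager's syncPod and emulates the
// kubelet -> kube-apiserver round-trip latency discussed above.
func fakeSyncPod(delay time.Duration) {
	time.Sleep(delay)
}

func main() {
	const podCount = 100
	const delay = 20 * time.Millisecond

	// Sequential: total time is roughly podCount * delay (~2s here).
	start := time.Now()
	for i := 0; i < podCount; i++ {
		fakeSyncPod(delay)
	}
	fmt.Printf("sequential: %v\n", time.Since(start))

	// Parallel: total time is roughly one delay, plus scheduling overhead.
	start = time.Now()
	var wg sync.WaitGroup
	for i := 0; i < podCount; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			fakeSyncPod(delay)
		}()
	}
	wg.Wait()
	fmt.Printf("parallel:   %v\n", time.Since(start))
}
```

With zero delay, the goroutine and lock overhead dominates (matching the slower test run above); with a realistic delay, the parallel path wins by roughly a factor of podCount.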
It looks like the provided unit test is not the best way to demonstrate the benefits of the change. How about providing an e2e test?
An e2e test should be better. I'll try it.
@bart0sh Using the spec "latency/resource should be within limit when create 90 pods with 0s interval" should achieve the objective; if feasible, consider employing tc qdisc add dev lo root netem delay 20ms to simulate network latency.
I believe it would be better to have it reproducible in an e2e (not e2e_node) test case. This would allow us to avoid simulation and see real-life delays.
Sorry, I had mistakenly believed that an e2e_node test case was needed. I need to carefully contemplate how to cover this scenario using an end-to-end (e2e) test case. As I am not yet well-versed in e2e testing, it might take me some additional time to figure this out.
-	m.syncPod(update.podUID, update.status)
+	go func() {
+		defer wg.Done()
+		m.syncPod(update.podUID, update.status)
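For readers skimming the diff, a self-contained sketch of the pattern the change introduces; manager, syncRequest, and the sample data below are stand-ins for the status manager's internals, not the PR's actual code:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// syncRequest stands in for the status manager's per-pod update record.
type syncRequest struct {
	podUID string
	status string
}

type manager struct{}

// syncPod emulates the apiserver round trip for one pod's status.
func (m *manager) syncPod(podUID, status string) {
	time.Sleep(20 * time.Millisecond)
	fmt.Printf("synced %s (%s)\n", podUID, status)
}

func main() {
	m := &manager{}
	updates := []syncRequest{
		{"pod-a", "Running"}, {"pod-b", "Pending"}, {"pod-c", "Succeeded"},
	}

	var wg sync.WaitGroup
	for _, update := range updates {
		wg.Add(1)
		// Pass update as an argument so each goroutine gets its own copy;
		// before Go 1.22, capturing the loop variable directly would make
		// every goroutine observe the last element.
		go func(u syncRequest) {
			defer wg.Done()
			m.syncPod(u.podUID, u.status)
		}(update)
	}
	wg.Wait()
}
```

Note that the sketch passes update into the goroutine explicitly; whether the PR itself needs that depends on the Go version Kubernetes targets.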
syncPod will cause multiple requests to the apiserver, but the kubelet's requests to the apiserver are limited by kube-api-qps and max-requests-inflight. And if there is a large cluster where multiple pods are created at the same time and the kubelets of multiple nodes run syncPod in parallel, it will undoubtedly trigger throttling and affect the normal work of the apiserver.
This PR will bring unpredictable risks; I may prefer #123877.
> And if there is a large cluster, multiple pods are created at the same time, and the kubelet of multiple nodes runs syncPod in parallel at the same time, it will undoubtedly trigger throttling and affect the normal work of Apiserver. This PR will bring unpredictable risks, I may prefer #123877
Thanks for your comment. The impact of GET operations on the apiserver and etcd is considerably less than that of LIST (listing all pods). Coupled with the apiserver's APF and QPS limits and the kubelet's QPS limits, this should suffice for the majority of scenarios.
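As background for the QPS argument, a minimal sketch using golang.org/x/time/rate (the package underlying client-go's default token-bucket rate limiter) of how a client-side limiter caps aggregate request rate no matter how many goroutines issue requests; the 50 QPS / 100 burst values are illustrative, not the PR's configuration:

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"

	"golang.org/x/time/rate"
)

func main() {
	// Illustrative values; the kubelet's real limits come from
	// --kube-api-qps and --kube-api-burst via client-go.
	limiter := rate.NewLimiter(rate.Limit(50), 100)

	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < 200; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each "request" blocks until the limiter admits it, so even
			// fully parallel callers cannot exceed the configured QPS.
			_ = limiter.Wait(context.Background())
			fmt.Printf("request %d sent at %v\n", i, time.Since(start).Round(time.Millisecond))
		}(i)
	}
	wg.Wait()
}
```

The point being made here is that this limiter sits between the parallel syncPod goroutines and the apiserver, so parallelism shifts when requests are issued but not the aggregate rate.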
What type of PR is this?
/sig node
/area kubelet
/kind cleanup
What this PR does / why we need it:
When a large number of pods are created or deleted on a node, calling syncPod in parallel within syncBatch can effectively boost the performance of pod status synchronization. This addresses the issues mentioned in #123875 and #123877.
unit test
Test scenario: syncPod has a 20 ms latency and 100 pods update their status concurrently.
e2e test
spec: latency/resource should be within limit when create 90 pods with 0s interval
env: VirtualBox Ubuntu 22.04, 4 cores, 8 GB mem, on macOS (Intel Core i7)
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: