
Kubelet synchronizes pod states in a parallel manner. #124596

Open
@zhifei92 wants to merge 1 commit into base: master

Conversation

@zhifei92 commented Apr 28, 2024

What type of PR is this?

/sig node
/area kubelet
/kind cleanup

What this PR does / why we need it:

When a large number of pods are created or deleted on a node, calling syncPod in parallel within syncBatch can effectively improve the performance of pod status synchronization. This addresses the issues mentioned in #123875 and #123877.
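As a rough illustration of the approach, here is a minimal, self-contained sketch of the dispatch pattern; the manager and update types below are simplified stand-ins for illustration, not the actual status manager code:

```go
package main

import "sync"

// podSync is a simplified stand-in for one queued status update.
type podSync struct {
	podUID string
}

type manager struct{}

// syncPod stands in for the real sync, which includes a round trip to the apiserver.
func (m *manager) syncPod(uid string) {
	_ = uid // the real code reads/patches the pod status here
}

func main() {
	m := &manager{}
	updates := make([]podSync, 100) // the batch of pending status updates

	// Instead of syncing the drained updates one by one, each syncPod call is
	// wrapped in a goroutine and the whole batch is awaited.
	var wg sync.WaitGroup
	for _, u := range updates {
		wg.Add(1)
		go func(u podSync) {
			defer wg.Done()
			m.syncPod(u.podUID)
		}(u)
	}
	wg.Wait()
}
```

The diff excerpt quoted later in this conversation wraps m.syncPod inside syncBatch in the same way.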

unit test

Test scenario: syncPod has a simulated 20ms latency while 100 pods update their status concurrently.

  1. Below are the test results without optimization
% make test WHAT=./pkg/kubelet/status GOFLAGS="-v" KUBE_TEST_ARGS='-run ^TestManyPodsUpdates$'
+++ [0428 18:37:36] Set GOMAXPROCS automatically to 12
+++ [0428 18:37:36] Running tests without code coverage and with -race
=== RUN   TestManyPodsUpdates
--- PASS: TestManyPodsUpdates (2.22s)
PASS
ok      k8s.io/kubernetes/pkg/kubelet/status    4.143s
  2. Below are the test results after optimization
% make test WHAT=./pkg/kubelet/status GOFLAGS="-v" KUBE_TEST_ARGS='-run ^TestManyPodsUpdates$'
+++ [0428 18:38:03] Set GOMAXPROCS automatically to 12
+++ [0428 18:38:04] Running tests without code coverage and with -race
=== RUN   TestManyPodsUpdates
--- PASS: TestManyPodsUpdates (0.09s)
PASS
ok      k8s.io/kubernetes/pkg/kubelet/status    2.010s
e2e test

Spec: latency/resource should be within limit when create 90 pods with 0s interval
Environment: VirtualBox, Ubuntu 22.04, 4 cores, 8GB memory, on macOS (Intel Core i7)

  1. Below are the test results without optimization
{
  "version": "v2",
  "dataItems": [
    {
      "data": {
        "Perc100": 39425.357059,
        "Perc50": 36094.848409,
        "Perc90": 39269.009005,
        "Perc99": 39425.357059
      },
      "unit": "ms",
      "labels": {
        "datatype": "latency",
        "latencytype": "create-pod"
      }
    }
  ],
  "labels": {
    "desc": "latency/resource should be within limit when create 90 pods with 0s interval [Benchmark][NodeSpecialFeature:Benchmark]",
    "image": "Ubuntu 22.04.4 LTS",
    "machine": "cpu:4core,memory:7.8GB",
    "node": "ubuntu3",
    "test": "density_create_batch_90_0_0_5",
    "timestamp": "1715749372"
  }
}
  2. Below are the test results after optimization
{
  "version": "v2",
  "dataItems": [
    {
      "data": {
        "Perc100": 28667.124507,
        "Perc50": 25658.057549,
        "Perc90": 28422.753707,
        "Perc99": 28667.124507
      },
      "unit": "ms",
      "labels": {
        "datatype": "latency",
        "latencytype": "create-pod"
      }
    }
  ],
  "labels": {
    "desc": "latency/resource should be within limit when create 90 pods with 0s interval [Benchmark][NodeSpecialFeature:Benchmark]",
    "image": "Ubuntu 22.04.4 LTS",
    "machine": "cpu:4core,memory:7.8GB",
    "node": "ubuntu3",
    "test": "density_create_batch_90_0_0_5",
    "timestamp": "1715750021"
  }
}

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot
Contributor

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the following labels on Apr 28, 2024: size/L (denotes a PR that changes 100-499 lines, ignoring generated files), sig/node (categorizes an issue or PR as relevant to SIG Node), do-not-merge/release-note-label-needed (indicates that a PR should not merge because it's missing one of the release note labels), area/kubelet, kind/cleanup (categorizes issue or PR as related to cleaning up code, process, or technical debt), cncf-cla: yes (indicates the PR's author has signed the CNCF CLA).
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the following labels on Apr 28, 2024: needs-triage (indicates an issue or PR lacks a `triage/foo` label and requires one), needs-priority (indicates a PR lacks a `priority/foo` label and requires one), needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test).
@k8s-ci-robot
Contributor

Hi @zhifei92. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: zhifei92
Once this PR has been reviewed and has the lgtm label, please assign mrunalp for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zhifei92
Author

zhifei92 commented May 7, 2024

/assign @dims @rphillips Please help me review this. Thanks.

@@ -436,6 +459,19 @@ func TestStaleUpdates(t *testing.T) {
verifyUpdates(t, m, 0)
}

func TestManyPodsUpdates(t *testing.T) {
const podNums = 100
Contributor

@zhifei92 FWIW, in my environment (Ubuntu 22.04 on Intel(R) Core(TM) i9-7920X CPU @ 2.90GHz, 24 cores) the change slows down the test:
master:

kubernetes (master) $ make test WHAT=./pkg/kubelet/status GOFLAGS="-v" KUBE_TEST_ARGS='-count 1 -run ^TestManyPodsUpdates$'
+++ [0508 14:06:19] Set GOMAXPROCS automatically to 24
+++ [0508 14:06:19] Running tests without code coverage and with -race
=== RUN   TestManyPodsUpdates
--- PASS: TestManyPodsUpdates (0.05s)
PASS
ok  	k8s.io/kubernetes/pkg/kubelet/status	1.113s

this PR:

kubernetes (124596) $ make test WHAT=./pkg/kubelet/status GOFLAGS="-v" KUBE_TEST_ARGS='-count 1 -run ^TestManyPodsUpdates$'
+++ [0508 14:03:41] Set GOMAXPROCS automatically to 24
+++ [0508 14:03:42] Running tests without code coverage and with -race
=== RUN   TestManyPodsUpdates
--- PASS: TestManyPodsUpdates (0.13s)
PASS
ok  	k8s.io/kubernetes/pkg/kubelet/status	1.231s

Author

"Yes, that's correct, especially if we're not simulating any latency scenarios, where the extra overhead of locks does become a factor. However, considering there is inherent network latency in the communication between the kubelet and the kube-apiserver, we need to emulate a delay of 10 to 20 milliseconds within the syncPod function to truly demonstrate the value of parallel processing."

Contributor

It looks like the provided unit test is not the best way to demonstrate the benefits of the change. How about providing an e2e test?

Author

An e2e test would be better. I'll try it.

Author

@bart0sh Using the spec "latency/resource should be within limit when create 90 pods with 0s interval" should achieve the objective; if feasible, consider employing `tc qdisc add dev lo root netem delay 20ms` to simulate network latency.

Contributor

I believe it would be better to have it reproducible in an e2e (not e2e_node) test case. This would allow us to avoid simulation and see real-life delays.

Author

Sorry, I had mistakenly believed that an e2e_node test case was needed. I need to carefully contemplate how to cover this scenario using an end-to-end (e2e) test case. As I am not yet well-versed in e2e testing, it might take me some additional time to figure this out.

@bart0sh moved this from Triage to Waiting on Author in SIG Node PR Triage on May 15, 2024
-	m.syncPod(update.podUID, update.status)
+	go func() {
+		defer wg.Done()
+		m.syncPod(update.podUID, update.status)
Contributor

syncPod causes multiple requests to the apiserver, but the kubelet's requests to the apiserver are limited by kube-api-qps and max-requests-inflight.

And in a large cluster, if many pods are created at the same time and the kubelets of multiple nodes run syncPod in parallel at the same time, it will undoubtedly trigger throttling and affect the normal operation of the apiserver.

This PR will bring unpredictable risks; I may prefer #123877.

Author

> And in a large cluster, if many pods are created at the same time and the kubelets of multiple nodes run syncPod in parallel at the same time, it will undoubtedly trigger throttling and affect the normal operation of the apiserver.
>
> This PR will bring unpredictable risks; I may prefer #123877.

Thanks for your comment. The impact of GET operations on the apiserver and etcd is considerably less than that of LIST (listing all pods). Coupled with apiserver APF (API Priority and Fairness), the apiserver QPS limits, and the kubelet QPS limits, this should suffice for the majority of scenarios.
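To illustrate the rate-limiting argument, here is a small stand-alone sketch using client-go's flowcontrol token bucket; the QPS and burst values are made up for illustration, while the kubelet's actual limits come from its --kube-api-qps and --kube-api-burst settings:

```go
package main

import (
	"fmt"
	"sync"
	"time"

	"k8s.io/client-go/util/flowcontrol"
)

func main() {
	// Illustrative values only: 10 QPS with a burst of 20.
	limiter := flowcontrol.NewTokenBucketRateLimiter(10, 20)

	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			limiter.Accept() // blocks until a token is available
			// a real client would issue its status GET/PATCH here
		}()
	}
	wg.Wait()

	// 100 requests at 10 QPS with a burst of 20 take roughly (100-20)/10 = 8s,
	// no matter how many goroutines issue them concurrently.
	fmt.Println("elapsed:", time.Since(start))
}
```

So even with per-pod goroutines, the request rate the apiserver sees stays bounded by the client-side limiter (and by server-side APF).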
