
[BUG] Executor state map becoming too large can result in failures to write the CR to etcd #2025

Open
1 task done
jacobsalway opened this issue May 11, 2024 · 0 comments
jacobsalway commented May 11, 2024

Description

We use the operator to manage the lifecycle of both batch and Spark streaming applications. Streaming apps in particular are long-lived and, if using dynamic allocation, may scale up and down over time, resulting in the creation of new executor IDs (see the second link below for where Spark increments executor IDs).

The operator tracks the state of each individual executor pod inside the .Status.ExecutorState field, but these entries are never removed. For long-lived streaming applications this map can eventually become so large that the operator cannot write the CR back to etcd because the update exceeds the request size limit.

https://github.com/kubeflow/spark-operator/blob/master/pkg/controller/sparkapplication/controller.go#L367-L436

https://github.com/apache/spark/blob/d82458f15539eef8df320345a7c2382ca4d5be8a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala#L460
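For illustration, here is a minimal Go sketch of the growth pattern. This is not the operator's actual code; the type and constant names are assumptions. The point is that every new executor pod name becomes a key in the status map, entries are only ever added or updated, and nothing deletes them.

```go
// Minimal sketch (not the operator's code) of how .Status.ExecutorState only grows:
// each new executor pod name becomes a key and is never removed, even after the pod is gone.
package main

import "fmt"

// ExecutorState mirrors the string-typed per-executor state the operator records
// (constant names here are illustrative).
type ExecutorState string

const (
	ExecutorRunningState   ExecutorState = "RUNNING"
	ExecutorCompletedState ExecutorState = "COMPLETED"
)

// recordExecutorState adds or updates an entry for a pod; nothing ever removes
// entries, so the map size is bounded only by the number of unique executor IDs
// created over the application's lifetime.
func recordExecutorState(status map[string]ExecutorState, podName string, state ExecutorState) {
	status[podName] = state
}

func main() {
	status := map[string]ExecutorState{}
	// With dynamic allocation, Spark keeps incrementing executor IDs, so a
	// long-lived streaming app produces an unbounded stream of new pod names.
	for id := 1; id <= 100000; id++ {
		pod := fmt.Sprintf("my-app-exec-%d", id)
		recordExecutorState(status, pod, ExecutorRunningState)
		// The pod later terminates; its entry is updated, never removed.
		recordExecutorState(status, pod, ExecutorCompletedState)
	}
	fmt.Println("entries in .Status.ExecutorState:", len(status))
}
```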

  • ✋ I have searched the open/closed issues and my issue is not listed.

Reproduction Code [Required]

Create enough unique executors over the lifetime of a Spark application that eventually the CR becomes larger than the max etcd request size.
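As a rough back-of-envelope estimate (the per-entry size below is assumed, not measured), the serialized map alone can approach a ~1.5 MB request limit after a few tens of thousands of unique executors:

```go
// Rough estimate of how many executor entries push the serialized status map
// toward the etcd request size limit. Both constants are assumptions.
package main

import "fmt"

func main() {
	// One serialized JSON entry looks roughly like:
	//   "my-app-exec-12345":"COMPLETED",
	const bytesPerEntry = 50         // assumed average serialized size per executor entry
	const etcdLimitBytes = 1_500_000 // ~1.5 MB etcd request size limit (per EKS docs)
	fmt.Println("approx. executors needed to exceed the limit:", etcdLimitBytes/bytesPerEntry)
}
```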

Expected behavior

The operator should not fail to write the CR back to etcd.

Actual behavior

If enough executor IDs are accumulated within a single application, eventually the operator may fail to write the CR back to etcd.

Terminal Output Screenshot(s)

I can't find any internal screenshots of how large the executor state map was, but I did find the etcd write failure log.

(Attached: screenshot of the etcd write failure log.)

Environment & Versions

  • Spark Operator App version: internal
  • Helm Chart Version: internal
  • Kubernetes Version: 1.26
  • Apache Spark version: 3.3-3.5

Additional context

Internally we found that no one was using this field, so we effectively disabled the tracking. We use EKS, so I cannot change the maximum etcd request size, but the EKS docs say it is 1.5 megabytes.
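For reference, here is a rough Go sketch of the kind of pruning that would keep the map bounded. It is purely hypothetical, not an existing operator feature or a proposed patch: before writing the status back, drop terminated executor entries beyond a cap.

```go
// Hypothetical mitigation sketch (not an existing operator option): cap the number of
// terminated executor entries kept in the status map before the status update, so the
// CR stays well under the etcd request size limit.
package main

import (
	"fmt"
	"sort"
)

type ExecutorState string

const ExecutorCompletedState ExecutorState = "COMPLETED"

// pruneExecutorState keeps at most maxEntries terminated executors (all non-terminated
// entries are kept), removing entries in lexicographic pod-name order as a simple
// stand-in for "oldest first".
func pruneExecutorState(states map[string]ExecutorState, maxEntries int) {
	var terminated []string
	for pod, s := range states {
		if s == ExecutorCompletedState || s == "FAILED" {
			terminated = append(terminated, pod)
		}
	}
	if len(terminated) <= maxEntries {
		return
	}
	sort.Strings(terminated)
	for _, pod := range terminated[:len(terminated)-maxEntries] {
		delete(states, pod)
	}
}

func main() {
	states := map[string]ExecutorState{}
	for i := 1; i <= 10; i++ {
		states[fmt.Sprintf("exec-%03d", i)] = ExecutorCompletedState
	}
	pruneExecutorState(states, 3)
	fmt.Println("remaining entries:", len(states)) // 3
}
```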
