Gardener fails to create shoots with > ~80 worker pools #9545
Labels
area/ipcei: IPCEI (Important Project of Common European Interest)
area/scalability: Scalability related
kind/bug: Bug
kind/epic: Large multi-story topic
How to categorize this issue?
/area scalability
/kind bug
What happened:
When attempting to create a shoot with more than approximately 80 node pools, the shoot becomes stuck in the `Create Processing` state. The shoot reports the following error:
Flow "Shoot cluster reconciliation" encountered task errors: [task "Configuring shoot worker pools" failed: retry failed with context deadline exceeded, last error: etcdserver: request is too large] Operation will be retried.
Investigation showed that the `Worker` resource created for the shoot becomes excessively large because each `WorkerPool` contains the `userData` necessary for machine bootstrap. As a result, the `Worker` resource exceeds etcd's `max-request-bytes` limit of 1.5MiB.
What you expected to happen:
The shoot should be successfully created.
How to reproduce it (as minimally and precisely as possible):
Create a shoot with a large number of node pools (roughly 80-90 or more).
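For a rough sense of why the failure appears around 80 pools, the following back-of-envelope sketch divides the 1.5MiB limit quoted above by the pool count. The 20 KiB per-pool payload is a hypothetical figure chosen for illustration, not a measured value, and the helper names are made up for this example. It also ignores all other fields of the resource.

```go
package main

import "fmt"

// limitBytes is the 1.5MiB request size limit quoted in this issue.
const limitBytes = 1536 * 1024

// perPoolBudget returns how many bytes each pool may contribute if the
// whole request budget were split evenly across all pools.
func perPoolBudget(limit, pools int) int {
	if pools <= 0 {
		return 0
	}
	return limit / pools
}

// fitsInRequest reports whether pools*bytesPerPool stays within the limit.
func fitsInRequest(limit, pools, bytesPerPool int) bool {
	return pools*bytesPerPool <= limit
}

func main() {
	// Assumption for illustration only: ~20 KiB of inlined user data per pool.
	const assumedUserData = 20 * 1024

	fmt.Println(perPoolBudget(limitBytes, 80))                  // 19660
	fmt.Println(fitsInRequest(limitBytes, 80, assumedUserData)) // false
	fmt.Println(fitsInRequest(limitBytes, 70, assumedUserData)) // true
}
```

Under that assumption, each of 80 pools may carry a bit under 20 KiB of inlined data before the request is rejected, which is roughly consistent with the observed threshold.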
Proposed Solution:
Replace the `userData` field in the `WorkerPool` type with a secretReference. This aligns with the `OperatingSystemConfig` resource, which already stores its `cloud_config` in a secret. Refer to the OperatingSystemConfig documentation for more details.
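A minimal sketch of how a consumer might handle such a reference, assuming simplified stand-in types: `SecretKeyRef`, `WorkerPool`, `SecretStore`, and `ResolveUserData` below are hypothetical and are not the real Gardener API types or helpers. The idea is that the pool carries only a small reference, the payload is looked up separately, and the inline field serves as a fallback during migration.

```go
package main

import (
	"errors"
	"fmt"
)

// SecretKeyRef is a simplified stand-in for a reference to one key of a secret.
type SecretKeyRef struct {
	Name string
	Key  string
}

// WorkerPool is a hypothetical, trimmed-down pool type for illustration.
type WorkerPool struct {
	Name              string
	UserData          []byte        // inline payload (the field proposed for replacement)
	UserDataSecretRef *SecretKeyRef // small reference instead of the payload
}

// SecretStore maps secret name -> key -> value. It stands in for an API lookup.
type SecretStore map[string]map[string][]byte

// ResolveUserData prefers the secret reference and falls back to inline data.
func ResolveUserData(pool WorkerPool, secrets SecretStore) ([]byte, error) {
	if ref := pool.UserDataSecretRef; ref != nil {
		data, ok := secrets[ref.Name][ref.Key]
		if !ok {
			return nil, fmt.Errorf("user data not found in secret %q under key %q", ref.Name, ref.Key)
		}
		return data, nil
	}
	if len(pool.UserData) > 0 {
		return pool.UserData, nil
	}
	return nil, errors.New("pool has neither a user data secret reference nor inline user data")
}

func main() {
	secrets := SecretStore{
		"pool-a-user-data": {"cloud_config": []byte("#cloud-config for pool-a")},
	}
	pool := WorkerPool{
		Name:              "pool-a",
		UserDataSecretRef: &SecretKeyRef{Name: "pool-a-user-data", Key: "cloud_config"},
	}

	data, err := ResolveUserData(pool, secrets)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(data))
}
```

With this shape, the size of the pool list grows by a few dozen bytes per pool rather than by the full bootstrap payload.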
Tasks:
- `.spec.pools[].userDataSecretRef` to prevent inlining the entire user data #9722
- `Worker` extensions (after `v1.95` has been released)
  - `provider-alicloud`: Read new `.spec.pools[].userDataSecretRef` field for worker pool user data gardener-extension-provider-alicloud#727
  - `provider-aws`: Read new `.spec.pools[].userDataSecretRef` field for worker pool user data gardener-extension-provider-aws#961
  - `provider-azure`: Read new `.spec.pools[].userDataSecretRef` field for worker pool user data gardener-extension-provider-azure#868
  - `provider-gcp`: Read new `.spec.pools[].userDataSecretRef` field for worker pool user data gardener-extension-provider-gcp#767
  - `provider-openstack`: Read new `.spec.pools[].userDataSecretRef` field for worker pool user data gardener-extension-provider-openstack#776
  - `provider-equinix-metal`: Read new `.spec.pools[].userDataSecretRef` field for worker pool user data gardener-extension-provider-equinix-metal#314
- After `gardener/gardener@v1.100` has been released: Drop deprecated `UserData` field from `extensions.gardener.cloud/v1alpha1.Worker` resource
Environment:
Kubernetes version (use `kubectl version`): v1.26.14