
[EKS] [request]: Ability to configure pod-eviction-timeout #159

Open
ChrisCooney opened this issue Feb 10, 2019 · 44 comments
Labels: EKS (Amazon Elastic Kubernetes Service), Proposed (Community submitted issue)

Comments

@ChrisCooney commented Feb 10, 2019

Tell us about your request
I would like to be able to change configuration values for components like the kube-controller-manager. This enables greater customisation of the cluster to specific, bespoke needs. It would also go a long way towards making the cluster more resilient and self-healing.

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

At present, we have a cluster managed by EKS. The default pod-eviction-timeout is five minutes, meaning that we can lose an instance and the control plane won't reschedule its pods for five minutes. Five-minute outages for things like our payment systems are simply unacceptable; the cost impact would be severe. At present, to the best of my knowledge, the control plane is not configurable at all.

What we would like to be able to do is provide configuration parameters via the AWS API or within a Kubernetes resource like a ConfigMap. Either would mean that, when we bring up new EKS clusters, we can automate the configuration of values like pod-eviction-timeout.
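
For context, on a self-managed control plane this value is just a kube-controller-manager flag, which is exactly the part EKS does not expose today. A minimal sketch, with an illustrative value:

# kube-controller-manager flag (not reachable on the EKS-managed control plane);
# the 90s value below is illustrative, the upstream default is 5m0s
kube-controller-manager --pod-eviction-timeout=90s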

Are you currently working around this issue?
No, to the best of my knowledge, it isn't something that EKS presently supports.

ChrisCooney added the Proposed (Community submitted issue) label Feb 10, 2019
abby-fuller added the EKS (Amazon Elastic Kubernetes Service) label Feb 12, 2019
@tabern (Contributor) commented Feb 15, 2019

Thanks for submitting this, Chris. At present, the 5-minute timeout is the Kubernetes default. We're evaluating adding additional configuration parameters to the control plane and have added this to our list of parameters to research exposing for per-cluster customization.

@ChrisCooney (Author)

Hi @tabern, thanks for the response. Yes, I'm aware of the Kubernetes default. A large portion of those running K8s in production have actively tweaked these values, and I worry this would be a barrier to EKS supporting some of our more critical applications.

Glad to hear this is being evaluated and look forward to seeing where it goes.

tabern changed the title from "[EKS] [request]: Configure Control Plane Values" to "[EKS] [request]: Ability to configure pod-eviction-timeout" Feb 15, 2019
@tabern (Contributor) commented Feb 15, 2019

@ChrisCooney sounds good. We're going to look into this. I've updated the title of your request to specifically address this ask so we can track it.

tabern added this to Researching in containers-roadmap Feb 16, 2019
@BrianChristie

To add another use case:
We also wish to be able to adjust pod-eviction-timeout, specifically to facilitate the use of Spot Instances. In the case that an instance is terminated without the running Pods being properly evicted, we want a short timeout before those Pods are rescheduled elsewhere.

Thanks!

@dawidmalina

Ideally we should also be able to tune:

--node-monitor-period
--node-monitor-grace-period

@geerlingguy

I would also very much like to have control over HPA scaling delays since there's no other way to do it:

--horizontal-pod-autoscaler-downscale-delay
--horizontal-pod-autoscaler-upscale-delay

@whereisaaron

@savar commented Apr 17, 2019

Also --horizontal-pod-autoscaler-cpu-initialization-period and --horizontal-pod-autoscaler-downscale-stabilization. If one of our HPAs is failing miserably, a second one can only scale on CPU utilization, and since the pods are resource-limited, utilization can only reach roughly twice the "wished" target, so we can only scale up by 2x each run (https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details). That means with 16 pods running we only grow to 32, then it takes 5 minutes before it scales to 64, and another 5 minutes to 128. If the other HPA, which is failing at that time, had 800 pods running and drops to 300, it takes ages to cover the missing 500 pods.
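
For reference, the replica calculation from the algorithm-details page linked above is:

desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]

Since a CPU-limited pod can only report roughly twice the target utilization, each evaluation can at most double the replica count, which is what caps the 16 → 32 → 64 → 128 progression described here.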

@echoboomer

Are there plans to allow passing in any number of parameters from something like https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/ (specifically --terminated-pod-gc-threshold), or is the plan to only allow customizing certain parameters?

@eladitzhakian

Could also use the ability to modify

--horizontal-pod-autoscaler-use-rest-clients

since I'm having problems with the HPA and metrics-server and can't view or configure it.

@mebuzz commented Sep 5, 2019

Looks like more and more people adopting k8s on EKS are in urgent need of these customizations, specifically the ones already mentioned:
--horizontal-pod-autoscaler-downscale-delay
--horizontal-pod-autoscaler-upscale-delay
and
--pod-eviction-timeout

We are unable to meet our worker node patching requirements (draining helps a little, but not enough to comply).

@ghost commented Sep 9, 2019

Actually, five minutes is sometimes too long to wait before deleting pods on failed nodes.
--pod-eviction-timeout should be configurable on EKS too.

@chillybug

I really need to be able to set this one:
--horizontal-pod-autoscaler-upscale-delay

@gillbee commented Nov 12, 2019

Any updates? We're also looking for the ability to configure these values.

@PaulMaddox commented Nov 26, 2019

As an interim workaround, instead of using --pod-eviction-timeout, can you use Taint Based Evictions to set this on a per-pod basis? This is supported in EKS clusters running 1.13+.

There's an example in this issue: kubernetes/kubernetes#74651
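
A minimal sketch of that per-pod approach, assuming taint-based evictions are active (they are on 1.13+); the pod name, image, and 30-second values are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: payments-api                           # illustrative name
spec:
  containers:
    - name: app
      image: example.com/payments-api:latest   # illustrative image
  tolerations:
    # Evict this pod 30s after its node is marked not-ready or unreachable,
    # instead of waiting out the cluster-wide default.
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 30
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 30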

@echoboomer

Not sure if this works for everybody or everything but I recently noticed this in the AWS EKS node AMI:

https://github.com/awslabs/amazon-eks-ami/blob/master/files/kubelet.service#L14

Notice the use of $KUBELET_ARGS $KUBELET_EXTRA_ARGS here - we were able to pass in my original requirement of --terminated-pod-gc-threshold this way, but I'm not entirely certain that a) AWS honors things placed here or b) these work with master-node abstraction.
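
For anyone trying this route, a sketch of how extra kubelet flags are usually passed on the EKS-optimized AMI; the cluster name and flag values are illustrative, and note this only reaches the kubelet on the worker node, not the managed control plane:

# in the node's user data (EKS-optimized AMI)
/etc/eks/bootstrap.sh my-cluster \
  --kubelet-extra-args '--max-pods=58 --node-labels=workload=critical'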

@ChrisCooney (Author)

> Not sure if this works for everybody or everything but I recently noticed this in the AWS EKS node AMI:
>
> https://github.com/awslabs/amazon-eks-ami/blob/master/files/kubelet.service#L14
>
> Notice the use of $KUBELET_ARGS $KUBELET_EXTRA_ARGS here - we were able to pass in my original requirement of --terminated-pod-gc-threshold this way, but I'm not entirely certain that a) AWS honors things placed here or b) these work with master-node abstraction.

Yeah, this means you can configure the kubelet on the node. Alas, it doesn't allow us to configure the Kubernetes control plane.

@shivarajai

Can you allow the ability to modify the below flags on the kube-controller-manager, so we can manage the cool-down delay instead of the default 5 minutes:
--horizontal-pod-autoscaler-downscale-delay
--horizontal-pod-autoscaler-upscale-delay

@jicowan commented Mar 17, 2020

@starchx commented Apr 13, 2020

Add:

--terminated-pod-gc-threshold

@calebwoofenden commented May 14, 2020

Jumping in to request that --horizontal-pod-autoscaler-initial-readiness-delay also be added. We are running an HPA in our EKS clusters and are unable to fully configure it how we would like.

I'm not sure why kube chose to have all of these HPA-related configs go on the controller manager instead of being configured on the HPA resource itself, but that's another story.

@mikestef9 (Contributor)

Note that 1.18 adds support for configurable scaling behavior:

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-configurable-scaling-behavior

So this will be possible once EKS supports 1.18
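
A minimal sketch of that behavior field on an autoscaling/v2beta2 HorizontalPodAutoscaler; the target name and every value here are illustrative:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app
  minReplicas: 2
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      # roughly the per-HPA replacement for tuning the old downscale delay
      stabilizationWindowSeconds: 120
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Pods
          value: 10
          periodSeconds: 60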

@danijelk

Still, with 1.18 it doesn't seem to work:

error validating data: ValidationError(HorizontalPodAutoscaler.spec): unknown field "behavior" in io.k8s.api.autoscaling.v2beta1.HorizontalPodAutoscalerSpec;

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T18:49:28Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.8-eks-7c9bda", GitCommit:"7c9bda52c425d0d56d7b93f1377a826b4132c05c", GitTreeState:"clean", BuildDate:"2020-08-28T23:04:33Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

@toricls (Contributor) commented Nov 23, 2020

@danijelk try v2beta2 for it.

@danijelk

@toricls Ah, I didn't see I was on v2beta1; k8s accepted it now, thanks.

@aniruddhch

Is there a way to set the --terminated-pod-gc-threshold on the Kube-controller-manager with EKS? A solution was suggested earlier about specifying the parameters in the AMI. Is that a recommended way to do it for now? Although, that would mean having a custom AMI that needs to be updated every time there is a new AMI version for EKS.

@tabern (Contributor) commented Mar 23, 2021

Closing this as setting these flags is supported in K8s v1.18 and higher.

tabern closed this as completed Mar 23, 2021
containers-roadmap automation moved this from Researching to Just Shipped Mar 23, 2021
@jerry123je

@tabern,
I understand that hpa.v2beta2 has the ability to add a behavior configuration, which resolves part of the requests.
However, I'm just curious how we can set pod-eviction-timeout on k8s v1.18+ without modifying the kube-controller-manager?

@EdwinPhilip

We need the horizontal-pod-autoscaler-initial-readiness-delay flag to be configurable in EKS, but that's not possible so far. Any info on how to configure it for EKS?

@lmgnid commented May 3, 2021

Not sure why this ticket is closed and marked "Shipped". How do we set pod-eviction-timeout?

mikestef9 reopened this May 4, 2021
@mibaboo commented May 25, 2021

I too require horizontal-pod-autoscaler-initial-readiness-delay on EKS, and the configurable scaling behavior does not support this.

@emmeowzing

It doesn't look like I can modify --horizontal-pod-autoscaler-sync-period either.

mikestef9 moved this from Just Shipped to We're Working On It in containers-roadmap Sep 27, 2021
mikestef9 moved this from We're Working On It to Researching in containers-roadmap Oct 21, 2021
@yongzhang

also need to customize pod-eviction-timeout

@sjortiz commented Dec 30, 2021

Needing this urgently :)

@marcusthelin

No status on this??

@TaiSHiNet

For everyone who's following this, see #1544

@dwgillies-bluescape commented May 16, 2022

+1 for allowing --terminated-pod-gc-threshold to be set. Evicted pods are piling up in our dev clusters, and the default limit of 12,500 evicted pods before garbage collection begins is way too high! We would like to reduce it to 100!

@michaelmohamed

Is there an update on this? I really need the ability to set terminated-pod-gc-threshold to use EKS.

@PrettySolution commented Jun 6, 2022

I'd like to set terminated-pod-gc-threshold to use EKS

@aaronmell

FYI, we thought we needed to increase horizontal-pod-autoscaler-initial-readiness-delay to solve an issue with autoscaling being too aggressive after rolling out new pods, causing scaling to max out.

Our issue was actually the custom metrics we were scaling on. We were doing something like this:
sum(rate(container_cpu_cfs_throttled_seconds_total[1m])). The issue here is that we collect metrics every 30s, and container_cpu_cfs_throttled_seconds_total doesn't increase in a linear fashion; it tends to increase in spurts.

We changed the rate from 1m to 2m, and that smoothed things out quite a bit and fixed our issue with aggressively scaling up.

This SO post has some good information about rate in Prometheus

https://stackoverflow.com/questions/38915018/prometheus-rate-functions-and-interval-selections

@mtcode commented Nov 1, 2022

--horizontal-pod-autoscaler-tolerance is another flag that is only customizable via controller manager flags. The v2beta2 API does not allow configuring this.

The default is 10% but I have use cases where the value should be less, making it more sensitive and responsive to changes.
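
For comparison, on a self-managed control plane this is a single kube-controller-manager flag; the 0.05 value is illustrative:

# 0.05 = 5% tolerance instead of the 10% default; controller-manager flag,
# so not reachable on the EKS-managed control plane
kube-controller-manager --horizontal-pod-autoscaler-tolerance=0.05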

@sftim commented Feb 27, 2023

Does the kube-controller-manager still support a --pod-eviction-timeout argument? The docs imply it was removed in v1.24.0 and the changelog implies it'll be removed in v1.27

@daynekoroman commented Dec 4, 2023

The default pod-eviction-timeout of 5m doesn't give us the opportunity to gracefully shut down pods on spot nodes: when a spot node goes down, the pod stays running and ready until the health-check interval elapses, which leads to 502 errors from the ALB.

@xzp1990 commented Jan 3, 2024

Hi team, 5 minutes is too long for a node issue; we hope the service team can allow users to change the settings below (the sketch after this list notes which component owns each flag).
--node-status-update-frequency
--node-monitor-period
--node-monitor-grace-period
--pod-eviction-timeout
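
For what it's worth, the first of those is a kubelet flag (reachable today via kubelet extra args on the nodes), while the other three are kube-controller-manager flags on the managed control plane, which is the part EKS doesn't expose. A sketch with illustrative values (roughly the upstream defaults):

# kubelet, on the worker node
kubelet --node-status-update-frequency=10s

# kube-controller-manager, on the EKS-managed control plane
kube-controller-manager \
  --node-monitor-period=5s \
  --node-monitor-grace-period=40s \
  --pod-eviction-timeout=5m0s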

Labels: EKS (Amazon Elastic Kubernetes Service), Proposed (Community submitted issue)
Projects: containers-roadmap (Researching)
Development: No branches or pull requests