ignored pod-eviction-timeout settings #74651
Comments
@kubernetes/sig-node-bugs |
@danielloczi: Reiterating the mentions to trigger a notification: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
I also ran into this issue while testing a lower eviction timeout. After poking around at this for some time I figured out that the cause is the new TaintBasedEvictions feature.
Setting the feature flag for this to false causes pods to be evicted as expected. I have not taken the time to search through the taint-based eviction code, but I would guess that it does not use this eviction timeout flag.
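For readers who want to try the same thing, here is a minimal sketch of what turning the gate off could look like. It assumes a kubeadm cluster where kube-controller-manager runs as a static pod and the gate is passed on its command line; the file path and the surrounding fields are assumptions, not something stated in this thread. It can only work on releases where `TaintBasedEvictions` is still a switchable gate.

```yaml
# /etc/kubernetes/manifests/kube-controller-manager.yaml
# Excerpt only; the path assumes kubeadm defaults.
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    # Turn off taint-based evictions so the controller falls back to
    # the older eviction path that honors --pod-eviction-timeout.
    - --feature-gates=TaintBasedEvictions=false
    - --pod-eviction-timeout=30s
    # ... remaining flags unchanged ...
```

The kubelet restarts a static pod when its manifest changes, so no separate apply step should be needed.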
Looking into this more: with TaintBasedEvictions set to true you can set a pod's eviction time in its spec, under `tolerations`.
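The snippet that accompanied this comment is not preserved here. As a rough sketch, per-pod eviction timing is expressed as `NoExecute` tolerations for the two node-condition taints; the 30-second value below is illustrative and not taken from the thread.

```yaml
# Pod spec excerpt: evict this pod 30 seconds after its node is tainted
# as not-ready or unreachable. The value 30 is an example only.
spec:
  tolerations:
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 30
  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 30
```

When a pod does not declare these tolerations, the API server adds them with a default of 300 seconds, which matches the 5 minutes observed in the report.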
Thanks for your feedback ChiefAlexander!
So I simply added my own values to the deployment.
After applying the deployment, when a node fails its status changes to "NotReady" and the pods are re-created after 2 seconds. So we don't have to deal with pod-eviction-timeout anymore; the timeout can be set on a per-pod basis! Cool! Thanks again for your help!
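The deployment itself is not shown in this copy of the thread. A complete manifest along these lines might look like the sketch below, assuming the 2-second value mentioned above; the names, labels, and image are placeholders.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eviction-test          # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: eviction-test
  template:
    metadata:
      labels:
        app: eviction-test
    spec:
      containers:
      - name: web
        image: nginx           # placeholder image
      # Tolerate a failed node for only 2 seconds before eviction.
      tolerations:
      - key: "node.kubernetes.io/not-ready"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 2
      - key: "node.kubernetes.io/unreachable"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 2
```

The tolerations only start counting once the node is tainted, so the delay before the node is marked "NotReady" still applies on top of the 2 seconds.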
@danielloczi Hi danielloczi, how did you fix this issue? I am running into it as well.
@323929 I think @danielloczi doesn't care about the |
That is right: I simply started to use |
Is it possible to make it global? I don't want to enable that for each pod config, especially since I use a lot of ready-made charts from Helm.
+1 for having the possibility to configure it for the whole cluster. Tuning per pod or per deployment is rarely useful: in most cases a sane global value is waaay more convenient, and the current default of 5m is waaay too long for many cases. Please, please reopen this issue.
I am facing this same problem. Is there a way to disable taint-based evictions so that pod-eviction-timeout works globally?
I think that you can configure global pod eviction via apiserver: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/ |
Why has this bug been marked as closed? It looks like the original issue is not solved, only worked around.
same issue |
I use those lines in deployment - as others say, global/cluster setting is better. |
Maybe you need to set this for kube-apiserver.
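The flags from this comment are not preserved here. A sketch of a cluster-wide setting, assuming a kubeadm cluster where kube-apiserver runs as a static pod, could use the two default-toleration flags; the path and the 30-second values are assumptions.

```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml
# Excerpt only; the path assumes kubeadm defaults.
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    # Default tolerationSeconds added to pods that do not set their own
    # not-ready / unreachable tolerations.
    - --default-not-ready-toleration-seconds=30
    - --default-unreachable-toleration-seconds=30
    # ... remaining flags unchanged ...
```

These defaults are applied when a pod is admitted, so pods that already exist would likely keep their previous 300-second tolerations until they are re-created.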
… kube-apiserver fields The kube-controller-manager flag `--pod-eviction-timeout` is deprecated in favor of the kube-apiserver flags `--default-not-ready-toleration-seconds` and `--default-unreachable-toleration-seconds`. The `--pod-eviction-timeout` flag has no effect when taint-based eviction is enabled. Taint-based eviction is beta (enabled by default) since Kubernetes 1.13 and GA since Kubernetes 1.18. For more details, see kubernetes/kubernetes#74651. This commit allows configuring the kube-apiserver flags `--default-not-ready-toleration-seconds` and `--default-unreachable-toleration-seconds`. The `podEvictionTimeout` field is deprecated in favor of the newly introduced fields. gardener-apiserver no longer defaults the `podEvictionTimeout` field. gardener-apiserver also returns a warning when the `podEvictionTimeout` field is set.
What happened: I modified the `pod-eviction-timeout` setting of kube-controller-manager on the master node (in order to decrease the amount of time before k8s re-creates a pod in case of node failure). The default value is 5 minutes; I configured 30 seconds. Using the `sudo docker ps --no-trunc | grep "kube-controller-manager"` command I checked that the modification was successfully applied.
I applied a basic deployment with two replicas.
The first pod was created on the first worker node, the second pod on the second worker node.
To test pod eviction I shut down the first worker node. After ~1 min the status of the first worker node changed to "NotReady", then I had to wait more than 5 minutes (the default pod eviction timeout) for the pod on the powered-off node to be re-created on the other node.
What you expected to happen:
After the node status reports "NotReady", the pod should be re-created on the other node after 30 seconds instead of the default 5 minutes!
How to reproduce it (as minimally and precisely as possible):
Create three nodes. Init Kubernetes on the first node (`sudo kubeadm init`), apply the network plugin (`kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"`), then join the other two nodes (like: `kubeadm join 10.0.1.4:6443 --token xdx9y1.z7jc0j7c8g8lpjog --discovery-token-ca-cert-hash sha256:04ae8388f607755c14eed702a23fd47802d5512e092b08add57040a2ae0736ac`).
Add the pod-eviction-timeout parameter to Kube Controller Manager on the master node with `sudo vi /etc/kubernetes/manifests/kube-controller-manager.yaml` (the yaml is truncated, only the relevant first part is shown here).
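The manifest excerpt did not survive in this copy of the report. As a sketch of the edit this step describes, assuming kubeadm's static pod layout, the flag is appended to the container's command list; every other field is left as kubeadm generated it.

```yaml
# /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt)
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    # Added line: lower the eviction timeout from the 5m default to 30s.
    - --pod-eviction-timeout=30s
    # ... flags generated by kubeadm follow unchanged ...
```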
Check that the setting is applied: `sudo docker ps --no-trunc | grep "kube-controller-manager"`
Apply a deployment with two replicas, and check that one pod is created on the first worker node and the second on the second worker node.
Shut down one of the nodes, and check the time elapsed between the node reporting "NotReady" and the pod being re-created.
Anything else we need to know?:
I experience the same issue in a multi-master environment as well.
Environment:
Kubernetes version (`kubectl version`): v1.13.3
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:08:12Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:00:57Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
OS (`cat /etc/os-release`): NAME="Ubuntu" VERSION="16.04.5 LTS (Xenial Xerus)"
Kernel (`uname -a`): Linux nodetest21 4.15.0-1037-azure #39~16.04.1-Ubuntu SMP Tue Jan 15 17:20:47 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux