kubeadm-1.15.3: pod-eviction-timeout is ignored #2

kskmori · 2019-08-29T04:15:00Z

revision: 1d9a9b6 2019-08-29 Update versions to kubernetes 1.15.3 and the latest documents

It takes 5 minutes for the pods to be evicted after a node failure, regardless of the kubeadm init config below. The same settings worked as expected with kubeadm-1.11.3.

controllerManager:
  extraArgs:
    node-monitor-grace-period: "20s"
    pod-eviction-timeout: "40s"

Diagnosis:

As of 1.13, Taint based Evictions is enabled by default, so the eviction delay has to be configured through it instead of pod-eviction-timeout. The pods still carry the default 300s tolerations:

[root@master osc2018tk-demo]# kubectl describe pod postgres-0 | grep Tolerations: -A 1
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
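
For illustration, the 300s values above can also be overridden per pod by declaring the tolerations explicitly, since the defaults are only added when a pod does not already tolerate these taints. This is a minimal sketch, not the change made in this repository; the 20-second value is an assumption chosen to match the timing discussed in this issue.

```yaml
# Fragment of a pod spec (for a StatefulSet or Deployment it goes under
# spec.template.spec). The 20s values are illustrative assumptions.
tolerations:
- key: "node.kubernetes.io/not-ready"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 20   # evict 20s after the node is tainted not-ready
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 20   # evict 20s after the node is tainted unreachable
```

Setting this per workload leaves the cluster-wide default untouched, which may be preferable when only a few pods need fast eviction.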

Ref.


…ion-timeout (fixes #2) NOTE: The Taint based Evictions timeout starts counting when the node status changes to NotReady (or Unreachable), so eviction takes 40s in total from the time of the actual failure: 40s = node-monitor-grace-period(20s) + default-not-ready-toleration-seconds(20s). The values were chosen so that the total equals pod-eviction-timeout=40s.
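
The commit itself is not reproduced in this thread. As a rough sketch of a kubeadm init config consistent with the note above, the toleration defaults can be passed to kube-apiserver as extra arguments. The apiVersion and the default-unreachable-toleration-seconds entry are assumptions (the note names only default-not-ready-toleration-seconds); both flags take a plain number of seconds rather than a duration string.

```yaml
# Sketch only: assumes the kubeadm v1beta2 config format shipped with 1.15.
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    # Default tolerationSeconds added to pods that do not set their own.
    default-not-ready-toleration-seconds: "20"
    default-unreachable-toleration-seconds: "20"
controllerManager:
  extraArgs:
    # Time without node status updates before the node is marked NotReady.
    node-monitor-grace-period: "20s"
```

With these values a pod would be evicted roughly 20s + 20s = 40s after the node actually fails. Pods created before the change keep their old tolerations until they are recreated.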

kskmori · 2019-08-29T05:09:21Z

fixed in 26076cb

[root@master osc2018tk-demo]# kubectl describe pod postgres-0 | grep Tolerations: -A 1
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 20s
                 node.kubernetes.io/unreachable:NoExecute for 20s

kskmori · 2019-08-29T05:12:22Z

Another note: the pod status is now shown as "Terminating" instead of "Unknown" as it was in 1.11, but the service availability is the same (it cannot fail over in the event of a node failure).

[root@master osc2018tk-demo]# kubectl get pods -o wide
NAME                     READY   STATUS        RESTARTS   AGE     IP           NODE      NOMINATED NODE   READINESS GATES
httpd-84b6977f6d-dhkrn   1/1     Running       0          3m46s   10.244.1.3   worker2   <none>           <none>
httpd-84b6977f6d-fc5rd   1/1     Terminating   0          9m40s   10.244.2.3   worker1   <none>           <none>
httpd-84b6977f6d-pfwjt   1/1     Running       0          9m40s   10.244.1.2   worker2   <none>           <none>
postgres-0               1/1     Terminating   0          9m43s   10.244.2.2   worker1   <none>           <none>

kskmori closed this as completed Aug 29, 2019
