This repository has been archived by the owner on Sep 4, 2021. It is now read-only.

Multi-node: GuaranteedUpdate of /registry/minions/<NODE> failed because of a conflict #898

Open
ashwinp opened this issue Aug 5, 2017 · 0 comments

Issue Details:

  • Worker nodes fail to update the node status.
  • kubectl get nodes on the master does not list one or more worker nodes.
  • Issue can be reproduced intermittently.
  • Restarting kubelet and/or kube-apiserver does not help.
  • This is not a transient failure: the affected worker nodes never manage to update their status and never show up in kubectl get nodes.

Setup details:

  • 3 worker nodes, 1 master node, 1 etcd node
  • All nodes run CoreOS-stable-1409.6.0-hvm (ami-00110279)
  • Issue can be reproduced with Kubernetes 1.6.4 as well as 1.7.0.
  • Issue can be reproduced with etcd 3.5.4 as well as 2.7.* (older version).

The kubelet on an affected worker node reports a successful registration, then fails to update the node status:

kubelet-wrapper[1657]: I0804 16:42:15.216223    1657 kubelet_node_status.go:77] Attempting to register node 172.0.60.57
kubelet-wrapper[1657]: I0804 16:42:15.218882    1657 kubelet_node_status.go:80] Successfully registered node 172.0.60.57
kubelet-wrapper[1657]: E0804 16:42:25.230766    1657 kubelet_node_status.go:326] Error updating node status, will retry: error getting node "172.0.60.57": nodes "172.0.60.57" not found
kubelet-wrapper[1657]: E0804 16:42:25.232449    1657 kubelet_node_status.go:326] Error updating node status, will retry: error getting node "172.0.60.57": nodes "172.0.60.57" not found

The Kubernetes API server logs show a conflict while updating the node in etcd, after which the node-controller issues a DELETE for the node:

I0804 16:42:15.220414       1 wrap.go:75] GET /api/v1/nodes/172.0.60.57: (736.057µs) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller] 127.0.0.1:47534]
I0804 16:42:15.227137       1 store.go:329] GuaranteedUpdate of /registry/minions/172.0.60.57 failed because of a conflict, going to retry
I0804 16:42:15.227245       1 store.go:329] GuaranteedUpdate of /registry/minions/172.0.60.57 failed because of a conflict, going to retry

I0804 16:42:15.227280       1 wrap.go:75] GET /api/v1/pods?fieldSelector=spec.nodeName%3D172.0.60.57: (7.793419ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller] 127.0.0.1:46858]
I0804 16:42:15.227314       1 wrap.go:75] PUT /api/v1/nodes/172.0.60.57: (6.490089ms) 409 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller] 127.0.0.1:47534]
I0804 16:42:15.227250       1 wrap.go:75] PATCH /api/v1/nodes/172.0.60.57: (6.805385ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/ttl-controller] 127.0.0.1:47536]
I0804 16:42:15.228557       1 wrap.go:75] GET /api/v1/nodes/172.0.60.57: (708.958µs) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller] 127.0.0.1:46858]
I0804 16:42:15.228820       1 wrap.go:75] PATCH /api/v1/namespaces/default/events/172.0.60.57.14d7b23385905550: (11.479188ms) 200 [[kubelet/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd] 172.0.60.57:59454]
I0804 16:42:15.228837       1 wrap.go:75] PATCH /api/v1/nodes/172.0.60.57/status: (6.707276ms) 200 [[kubelet/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd] 172.0.60.57:59454]
I0804 16:42:15.229323       1 wrap.go:75] PUT /api/v1/nodes/172.0.60.57: (406.754µs) 409 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller] 127.0.0.1:47536]
I0804 16:42:15.230566       1 wrap.go:75] GET /api/v1/nodes/172.0.60.57: (719.769µs) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller] 127.0.0.1:47536]
I0804 16:42:15.232358       1 wrap.go:75] PUT /api/v1/nodes/172.0.60.57: (1.469816ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller] 127.0.0.1:47536]
I0804 16:42:15.232840       1 wrap.go:75] PATCH /api/v1/namespaces/default/events/172.0.60.57.14d7b2338590686a: (3.188002ms) 200 [[kubelet/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd] 172.0.60.57:59454]
I0804 16:42:15.235985       1 wrap.go:75] PATCH /api/v1/namespaces/default/events/172.0.60.57.14d7b23385907c23: (2.451278ms) 200 [[kubelet/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd] 172.0.60.57:59454]

I0804 16:42:17.732567       1 wrap.go:75] DELETE /api/v1/nodes/172.0.60.57: (2.582459ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller] 127.0.0.1:47534]
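
The request lines above interleave several clients and are not in strict timestamp order, so it can help to filter them down to the node object and sort them. The following is a rough sketch for doing that, assuming the log format stays exactly as shown in this issue. The regular expressions and the helper names (`client_name`, `node_requests`) are made up for illustration and are not part of any Kubernetes tooling.

```python
import re

# Request-log lines as they appear in this issue, e.g.
# I0804 16:42:15.227314  1 wrap.go:75] PUT /api/v1/nodes/X: (6.4ms) 409 [[agent] addr]
LINE_RE = re.compile(
    r"^I\d{4} (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+\d+ wrap\.go:\d+\] "
    r"(?P<method>[A-Z]+) (?P<path>\S+): \([^)]*\) (?P<status>\d{3}) "
    r"\[\[(?P<agent>[^\]]+)\] (?P<addr>\S+)\]$"
)

# User agents in these logs look like
# "hyperkube/<version> (<os>/<arch>) kubernetes/<commit>[/<component>]".
AGENT_RE = re.compile(
    r"^(?P<binary>[^/]+)/\S+ \([^)]*\) kubernetes/[0-9a-f]+(?:/(?P<component>.+))?$"
)


def client_name(agent):
    """Return the component name if present, else the binary name."""
    m = AGENT_RE.match(agent)
    if m is None:
        return agent  # unknown shape: fall back to the raw user agent
    return m.group("component") or m.group("binary")


def node_requests(lines, node):
    """Return (time, client, method, status) for requests on one node object,
    sorted by logged timestamp. Lines that are not request logs are skipped."""
    prefix = "/api/v1/nodes/" + node
    rows = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        path = m.group("path")
        if path != prefix and not path.startswith(prefix + "/"):
            continue
        rows.append((m.group("time"), client_name(m.group("agent")),
                     m.group("method"), int(m.group("status"))))
    return sorted(rows)


# A subset of the lines from this issue, copied verbatim.
SAMPLE = """\
I0804 16:42:15.227137       1 store.go:329] GuaranteedUpdate of /registry/minions/172.0.60.57 failed because of a conflict, going to retry
I0804 16:42:15.227314       1 wrap.go:75] PUT /api/v1/nodes/172.0.60.57: (6.490089ms) 409 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller] 127.0.0.1:47534]
I0804 16:42:15.227250       1 wrap.go:75] PATCH /api/v1/nodes/172.0.60.57: (6.805385ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/ttl-controller] 127.0.0.1:47536]
I0804 16:42:15.228837       1 wrap.go:75] PATCH /api/v1/nodes/172.0.60.57/status: (6.707276ms) 200 [[kubelet/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd] 172.0.60.57:59454]
I0804 16:42:15.232358       1 wrap.go:75] PUT /api/v1/nodes/172.0.60.57: (1.469816ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller] 127.0.0.1:47536]
I0804 16:42:17.732567       1 wrap.go:75] DELETE /api/v1/nodes/172.0.60.57: (2.582459ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller] 127.0.0.1:47534]
"""

if __name__ == "__main__":
    for time, client, method, status in node_requests(SAMPLE.splitlines(), "172.0.60.57"):
        print(time, client, method, status)
```

Run against the full API server log, this makes it easier to see that the 409 responses and the final DELETE all come from the node-controller, while the kubelet and ttl-controller PATCH requests succeed. Note that the timestamps are the ones written to the log, so the sorted order is only an approximation of the order in which the requests were processed.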