Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"kubeadm reset" hangs forever intermittently #688

Closed
vhosakot opened this issue Feb 1, 2018 · 6 comments
Closed

"kubeadm reset" hangs forever intermittently #688

vhosakot opened this issue Feb 1, 2018 · 6 comments
Assignees
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. kind/documentation Categorizes issue or PR as related to documentation. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@vhosakot
Copy link

vhosakot commented Feb 1, 2018

BUG REPORT:

I see kubeadm version 1.7 hangs forever sometimes at

[reset] Removing kubernetes-managed containers

I am able to reproduce this issue 4 out of 5 times, and had to press CTRL+C to exit.

$ sudo kubeadm reset
sudo: unable to resolve host vhosakot-aci-1-w9c2681796d
[preflight] Running pre-flight checks
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Removing kubernetes-managed containers
^C  <--- Pressed CTRL+C to exit
$

Restarting docker by doing sudo systemctl restart docker.service resolves this issue and sudo kubeadm reset works fine without issues.

It will be nice if kubeadm reset checks the health of docker when removing kubernetes-managed containers and times out if docker in unhealthy, instead of hanging forever at:

[reset] Removing kubernetes-managed containers

kubeadm version:

$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.11", GitCommit:"b13f2fd682d56eab7a6a2b5a1cab1a3d2c8bdd55", GitTreeState:"clean", BuildDate:"2017-11-25T17:51:39Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.11", GitCommit:"b13f2fd682d56eab7a6a2b5a1cab1a3d2c8bdd55", GitTreeState:"clean", BuildDate:"2017-11-25T18:34:52Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.12", GitCommit:"3bda299a6414b4866f179921610d6738206a18fe", GitTreeState:"clean", BuildDate:"2017-12-29T08:39:49Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
  • OS:
    Ubuntu xenial VM:
$ sudo cat /etc/os-release
sudo: unable to resolve host vhosakot-aci-1-m3710102e28
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
  • Kernel:
$ uname -a
Linux vhosakot-aci-1-m3710102e28 4.4.0-104-generic #127-Ubuntu SMP Mon Dec 11 12:16:42 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Others:
$ sudo docker info
sudo: unable to resolve host vhosakot-aci-1-w9c2681796d
Containers: 13
 Running: 9
 Paused: 0
 Stopped: 4
Images: 11
Server Version: 1.13.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 67
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version:  (expected: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1)
runc version: N/A (expected: 9df8b306d01f59d3a8029be411de015b7304dd8f)
init version: N/A (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-104-generic
Operating System: Ubuntu 16.04.3 LTS
OSType: linux
Architecture: x86_64
CPUs: 3
Total Memory: 15.67 GiB
Name: vhosakot-aci-1-w9c2681796d
ID: X3ES:DTHR:RVNF:6OCE:2UXA:5VKA:LVVS:2G4K:KYHN:EBQZ:QH4C:SIQB
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

$ sudo docker version
sudo: unable to resolve host vhosakot-aci-1-w9c2681796d
Client:
 Version:      1.13.1
 API version:  1.26
 Go version:   go1.6.2
 Git commit:   092cba3
 Built:        Thu Nov  2 20:40:23 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.13.1
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.6.2
 Git commit:   092cba3
 Built:        Thu Nov  2 20:40:23 2017
 OS/Arch:      linux/amd64
 Experimental: false

What happened?

kubeadm reset hangs forever.

What you expected to happen?

kubeadm reset times out if docker in unhealthy.

How to reproduce it (as minimally and precisely as possible)?

See steps above.

@cwedgwood
Copy link

i see this too:

sudo kubeadm reset
[preflight] Running pre-flight checks
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Removing kubernetes-managed containers

what looks to be happening is docker kill/rm gets upset because of a wedged nfs mount

@vhosakot could it perhaps be something like that as well for you? that is stuck mount point?

@vhosakot
Copy link
Author

@cwedgwood I don't have any mount point and still see this issue.

Restarting docker by doing sudo systemctl restart docker.service resolves this issue and sudo kubeadm reset works fine without issues.

@timothysc
Copy link
Member

/assign @detiber - this is a docs update.

@timothysc timothysc added kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. kind/documentation Categorizes issue or PR as related to documentation. labels Apr 6, 2018
@timothysc
Copy link
Member

/assign @chuckha

@timothysc timothysc added this to the v1.11 milestone Apr 6, 2018
@chuckha chuckha added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Apr 25, 2018
@chuckha chuckha removed this from the v1.11 milestone Apr 25, 2018
@neolit123
Copy link
Member

i can take this.

adding info in https://kubernetes.io/docs/setup/independent/troubleshooting-kubeadm/
unless there is a better location

@neolit123
Copy link
Member

^ PR sent to /website. we don't have enough debug information about this, but i've added a note in the docs with the fix that @vhosakot provided.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. kind/documentation Categorizes issue or PR as related to documentation. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Projects
None yet
Development

No branches or pull requests

5 participants