
Scheduled pods cannot be started in docker. Container name already in use. #25359

Closed · rodcloutier opened this issue Feb 11, 2020 · 3 comments

@rodcloutier

What kind of request is this (question/bug/enhancement/feature request):

Bug (on Rancher 2.2.9, haven't tried 2.3.x)

Steps to reproduce (least amount of steps as possible):

  • Note: The issue is sporadic and hard to reproduce. We are working on reliable reproduction steps.
  • Create a cluster through Rancher
  • Reschedule pods several times

Expected Result:

  • Under normal conditions, with resources available, all containers from the pods should be scheduled and started in the docker daemon.

Actual Result:

  • Observed pods that are unable to start, with the following event pattern:
    Warning FailedCreatePodSandBox Failed create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod "test-64cd57b5c4-rk5bs": Error response from daemon: Conflict. The container name "/k8s_POD_test-64cd57b5c4-rk5bs_default_7c8ebf47-42a1-11ea-855b-fa163e9c2fd4_0" is already in use by container "be4f2ae1acbc90a7ce6d06a978c9080993d7fae6c6954e46646c149bb3d4755f". You have to remove (or rename) that container to be able to reuse that name.
    
  • Running docker ps -a on the targeted node does not list the offending container
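
For diagnosis, the daemon can also be queried directly for the conflicting ID and name taken from the event above (an illustrative sketch, not part of the original report):

$ docker inspect be4f2ae1acbc90a7ce6d06a978c9080993d7fae6c6954e46646c149bb3d4755f
$ docker ps -a --filter "name=k8s_POD_test-64cd57b5c4-rk5bs"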

Current workaround

  1. Drain the node (see the combined sketch after this list)
  2. Restart Docker on the node or, as an alternative, reboot the node
    $ rancher ssh <node>
    $ systemctl restart docker.service

  3. Uncordon the node (draining it also cordons it) from the UI or with the following command:
    $ kubectl uncordon <node>
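
Putting the steps together from a workstation with kubectl and the rancher CLI (a sketch; the exact drain flags are an assumption, since the report only names the step):

$ kubectl drain <node> --ignore-daemonsets --delete-local-data
$ rancher ssh <node>
$ systemctl restart docker.service   # run on the node, inside the ssh session
$ exit
$ kubectl uncordon <node>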
    

Other details that may be helpful:

Environment information

  • Rancher version (rancher/rancher or rancher/server image tag, or shown bottom left in the UI):
Rancher: v2.2.9
User Interface: v2.2.98
Helm: v2.10.0-rancher11
Machine: v0.15.0-rancher8-1
  • Installation option (single install/HA): HA deployment with 2 replicas on K8s.

Rancher Cluster information

  • Cluster type: Hosted
  • Machine type and specifications (CPU/memory): VM
  • Kubernetes version (use kubectl version): 1.13.5
  • Docker version (use docker version):
$ docker version
Client:
 Version:           18.06.3-ce
 API version:       1.38
 Go version:        go1.10.8
 Git commit:        d7080c1
 Built:             Tue Feb 19 23:07:53 2019
 OS/Arch:           linux/amd64
 Experimental:      false
Server:
 Engine:
  Version:          18.06.3-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       d7080c1
  Built:            Tue Feb 19 23:07:53 2019
  OS/Arch:          linux/amd64
  Experimental:     false

Target Cluster information (spawned by Rancher)

  • Kubernetes version (use kubectl version): 1.13.4, 1.13.5
  • Host OS: Seen with CoreOS 2079.4.0 and 2132.6.0
  • Docker version (use docker version):
$ docker version
Client:
 Version:           18.06.3-ce
 API version:       1.38
 Go version:        go1.10.8
 Git commit:        d7080c1
 Built:             Tue Feb 19 23:07:53 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.3-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       d7080c1
  Built:            Tue Feb 19 23:07:53 2019
  OS/Arch:          linux/amd64
  Experimental:     false
  • Output of the kubelet and docker logs will be provided once we can catch or reproduce the error.

@zaggash commented Feb 18, 2020

I think we can close this issue.

The issue was introduced by a change in Docker 17.04.
I see you are using k8s 1.13.5. Is that the downstream cluster version?
The PR has been cherry-picked to k8s 1.13 too:
kubernetes/kubernetes#79623
kubernetes/kubernetes#80758

It looks like it was merged on Sep 11, 2019, so the first fixed release is v1.13.11 (published 2019-09-18T16:24:07Z).
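
To check whether a cluster already carries the fix, the kubelet version reported by each node can be compared against v1.13.11 (an illustrative command, not from the original comment):

$ kubectl get nodes -o wide   # the VERSION column reports each node's kubelet version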

@rodcloutier (Author)

Yes, we can close this issue.
It was fixed in versions 1.13.11, 1.14.7, and 1.15.4.

@Teja1126 commented Dec 7, 2020

@zaggash

We are observing the same issue with the versions below:

k8s version 1.19.0

Docker version 19.03.13

"Scheduled pods cannot be started in docker. Container name already in use."
This issue is frequently observed when I reboot nodes/masters in the k8s cluster.

Warning Failed 3m48s (x4 over 5m51s) kubelet, master-3 Error: Error response from daemon: Conflict. The container name "/k8s_kube-apiserver_kube-apiserver-master-3_kube-system_1d25fb42cda5d90beda502e06a30a585_4" is already in use by container "04be72e367e5e30be717c10e9ef33dc6be7510653777af300aa67c1714b666fe". You have to remove (or rename) that container to be able to reuse that name.
Warning BackOff 3m33s (x10 over 5m50s) kubelet, master-3 Back-off restarting failed container
Normal Pulled 36s (x11 over ) kubelet, master-3 Container image "artifactory.radisys.com:8088/k8s.gcr.io/kube-apiserver:v1.19.0" already present on machine
Normal SandboxChanged kubelet, master-3 Pod sandbox changed, it will be killed and re-created.
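
Since the event names the conflicting container ID, and the daemon message itself suggests removal, one manual recovery on the affected node would be to remove that container directly (a sketch using the ID from the event above; forcing removal with -f is an assumption):

$ docker rm -f 04be72e367e5e30be717c10e9ef33dc6be7510653777af300aa67c1714b666fe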
