After stopping docker, previously running containers cannot be started or removed #5684

Closed
vieira opened this issue May 8, 2014 · 113 comments
Labels
kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed.

Comments

@vieira

vieira commented May 8, 2014

The issue can be reproduced as follows:

$ docker run -d ubuntu:trusty tail -f /dev/null
c39206003c7ae8992a554a9ac2ea130327fc4af1b2c389656c34baf9a56c84b5

$ stop docker
docker stop/waiting

$ start docker
docker start/running, process 2389

$ docker ps -q
# prints nothing...

$ docker ps -a -q
c39206003c7a

$ docker start c39206003c7a
Error: Cannot start container c39206003c7a: Error getting container c39206003c7ae8992a554a9ac2ea130327fc4af1b2c389656c34baf9a56c84b5 from driver devicemapper: Error mounting '/dev/mapper/docker-253:0-267081-c39206003c7ae8992a554a9ac2ea130327fc4af1b2c389656c34baf9a56c84b5' on '/var/lib/docker/devicemapper/mnt/c39206003c7ae8992a554a9ac2ea130327fc4af1b2c389656c34baf9a56c84b5': device or resource busy
2014/05/08 19:14:57 Error: failed to start one or more containers

$ docker rm c39206003c7a
Error: Cannot destroy container c39206003c7a: Driver devicemapper failed to remove root filesystem c39206003c7ae8992a554a9ac2ea130327fc4af1b2c389656c34baf9a56c84b5: Error running removeDevice
2014/05/08 19:15:15 Error: failed to remove one or more containers

This is an up to date Ubuntu 14.04 host running lxc-docker 0.11.1. Storage driver is devicemapper and kernel version is 3.13.0.

This is a regression from docker 0.9 (from the official Ubuntu repos). The problem is also present in 0.10.

@unclejack
Contributor

@vieira Please reboot the machine and let us know if you're still having troubles.

@vieira
Author

vieira commented May 8, 2014

The above steps are reproducible even after rebooting the machine.

@vieux
Contributor

vieux commented May 19, 2014

@alexlarsson can you please take a look? It seems to be related to devicemapper.

@alexlarsson
Contributor

The problem only seems to be related to devicemapper; I think it's really something else.
I tried this, and the problem is the "stop docker" part. If I just ctrl-c the docker daemon, it tries to stop the containers properly, but it never seems to succeed in stopping the container. So I ctrl-c a few more times to force docker to die.

At this point the container (tail) is still running, so the device mapper device will be mounted, which means we can't mount it again, or remove it. This is why these operations fail.
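
One way to confirm this is to look for leftover mounts in the kernel's mount table. The snippet below is only a minimal sketch: it assumes the devicemapper driver mounts container filesystems under /var/lib/docker/devicemapper/mnt/, as in the error message quoted above, and the function name list_dm_mounts is made up for this example.

```shell
# Print devicemapper container mount points that are still active.
# Usage: list_dm_mounts [mounts_file]   (defaults to /proc/mounts)
list_dm_mounts() {
    # Field 2 of each mount-table line is the mount point; keep those
    # that start with the (assumed) devicemapper mount prefix.
    awk -v prefix="/var/lib/docker/devicemapper/mnt/" \
        'index($2, prefix) == 1 { print $2 }' "${1:-/proc/mounts}"
}
```

If a path is printed for a container that the daemon reports as stopped, that would be consistent with the "device or resource busy" failures described here.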

@wyaeld

wyaeld commented May 23, 2014

@alexlarsson do you know an easy way to clean up the system once this goes wrong?

@alexlarsson
Contributor

Well, if you find the runaway container process maybe you could force kill it.

@crosbymichael modified the milestones: 1.1, 1.0 Jun 11, 2014

@aroragagan

@vieira you can unmount:
umount /var/lib/docker/devicemapper/mnt/c39206003c7ae8992a554a9ac2ea130327fc4af1b2c389656c34baf9a56c84b5

and then start the container again; it should work.
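
A slightly more careful variant of this workaround checks that the path is really mounted before calling umount. This is only a sketch: it assumes the mount directory is named after the full container ID, as in the paths quoted in this thread, and dm_mount_path and is_dm_mounted are hypothetical helper names.

```shell
# Build the (assumed) devicemapper mount path for a full container ID.
dm_mount_path() {
    printf '/var/lib/docker/devicemapper/mnt/%s\n' "$1"
}

# Exit 0 if that path appears as a mount point, 1 otherwise.
# Usage: is_dm_mounted <full-container-id> [mounts_file]
is_dm_mounted() {
    path=$(dm_mount_path "$1")
    awk -v p="$path" '$2 == p { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example (needs root; not executed here):
#   id=c39206003c7ae8992a554a9ac2ea130327fc4af1b2c389656c34baf9a56c84b5
#   is_dm_mounted "$id" && umount "$(dm_mount_path "$id")" && docker start "$id"
```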

@destfinal

I can see that my docker was started with -d and -r. First, when docker is restarted, the containers don't get restarted. Then the above-mentioned error happens when trying to start the container(s).

My CentOS 6.5 is still getting 1.0.0.6 from EPEL. Has this been identified as a bug in 1.0 and fixed in 1.1? Can somebody please confirm?

Thanks

@vieira
Author

vieira commented Jul 18, 2014

Hello everyone, still not fixed in 1.1.1.
The steps in the original post still apply.

Error response from daemon: Cannot start container 5e9bde9b409b: 
Error getting container 5e9bde9b409b001bcc685c0b478e925a53a03bab8d8ef3210bf24aa39410e30d 
from driver devicemapper: 
Error mounting '/dev/mapper/docker-253:0-267081-5e9bde9b409b001bcc685c0b478e925a53a03bab8d8ef3210bf24aa39410e30d' 
on 
'/var/lib/docker/devicemapper/mnt/5e9bde9b409b001bcc685c0b478e925a53a03bab8d8ef3210bf24aa39410e30d': 
device or resource busy

@atticussterman

I am getting this a lot as well, but it does seem to remove the container in some sense (in that I can start a new container with the same name).

@karcaw

karcaw commented Aug 14, 2014

Is there a workaround for this issue?

@codingtony

Looking for a workaround as well.

@rochacon

It seems that stopping all containers before stopping the docker daemon fixes the issue.

I've added this pre-stop block to my upstart job as a workaround:

pre-stop script
    /usr/bin/docker ps -q | xargs /usr/bin/docker stop
end script

Here is a gist with my debugging steps: https://gist.github.com/rochacon/4dfa7bd4de3c5f933f0d
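
One caveat with the pipeline above: GNU xargs runs the command once even on empty input unless -r is given, so when no containers are running it may call docker stop with no arguments, which fails. A guarded variant could look like the sketch below. The function name stop_running_containers and the DOCKER override variable are assumptions made for this illustration, not part of the original upstart job.

```shell
# Stop all running containers, doing nothing when there are none.
# DOCKER may point at an alternative docker binary (hypothetical override).
stop_running_containers() {
    docker_bin="${DOCKER:-/usr/bin/docker}"
    ids=$("$docker_bin" ps -q) || return 1
    # No running containers: skip "docker stop", which needs an argument.
    [ -n "$ids" ] || return 0
    # $ids is left unquoted on purpose so each ID becomes its own argument.
    "$docker_bin" stop $ids
}
```

The body of this function could be placed between "pre-stop script" and "end script" in place of the one-line pipeline.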

@vieira
Author

vieira commented Aug 23, 2014

@rochacon Thanks for your workaround. I will test it today or tomorrow with 1.2 (seems you tested with 1.1.1, right?). Hope it works.

@rochacon

@vieira I also tried with 1.2.0, same results.

@marcellodesales

After 4 weeks running, one of my containers stopped. I'm not sure why. How can I find the root cause?

Anyway, I had the same problem. It was solved with the suggestion from @aroragagan: umount, then docker start the container. I'm on RHEL 6.5, by the way.

[root@pppdc9prd3ga mdesales]# docker start federated-registry
Error response from daemon: Cannot start container federated-registry: Error getting container 4841fcb6e51f4e9fcd7a115ac3efae4b0fd47e4f785c735e2020d1c479dc3946 from driver devicemapper: Error mounting '/dev/mapper/docker-253:0-394842-4841fcb6e51f4e9fcd7a115ac3efae4b0fd47e4f785c735e2020d1c479dc3946' on '/var/lib/docker/devicemapper/mnt/4841fcb6e51f4e9fcd7a115ac3efae4b0fd47e4f785c735e2020d1c479dc3946': device or resource busy
2014/10/17 21:04:33 Error: failed to start one or more containers

[root@pppdc9prd3ga mdesales]# docker version
Client version: 1.1.2
Client API version: 1.13
Go version (client): go1.2.2
Git commit (client): d84a070/1.1.2
Server version: 1.1.2
Server API version: 1.13
Go version (server): go1.2.2
Git commit (server): d84a070/1.1.2

[root@pppdc9prd3ga mdesales]# umount /var/lib/docker/devicemapper/mnt/4841fcb6e51f4e9fcd7a115ac3efae4b0fd47e4f785c735e2020d1c479dc3946

[root@pppdc9prd3ga mdesales]# docker start federated-registry
federated-registry

@oskapt

oskapt commented Oct 22, 2014

We're seeing this on 1.3.0 now, on an EC2 Ubuntu system that was upgraded from 12.04 to 14.04. My dev instance is a direct 14.04 install into Vagrant and does not have this problem. Unmounting and then restarting the containers seems to work, but that defeats the purpose of having them configured to restart automatically when the instance reboots or when docker restarts. Let me know if there's any further information I can provide on versions of supporting packages, etc, since I have a working and non-working system available.

@srobertson

Seeing the same issue with docker 1.3 Ubuntu 14.04 with either Linux kernel 3.13 or 3.14.

@thaJeztah
Member

@srobertson are you referring to "containers not being restarted when the daemon restarts"? Are you using the new per-container restart-policy? Because the daemon-wide -r / --restart=true has been removed in Docker 1.2

The new (per container) restart-policy is described in the CLI reference

@ivan-kolmychek

+1, got this issue on docker 1.3 @ ArchLinux x86_64 with 3.17.2-1-ARCH kernel.

$ docker --version
Docker version 1.3.1, build 4e9bbfa

Umount solves the problem.

@mlehner616

umount is a workaround, I wouldn't say it solves the problem. Simply restarting the daemon with containers running will reproduce the issue.

@dsteinkopf

dsteinkopf commented Apr 16, 2016

No, the problem already existed using docker 1.10 and with the default ubuntu 14.04-kernel (~3.10 I think) and by using aufs. Then I upgraded (step by step) storage driver, kernel and docker. No significant change in the experienced problem...

Do you think it's worth trying overlay for this problem? (Performance is not a big issue in my case.)

@pascalandy

@thaJeztah I never saw this issue before and since I

Did you run into this after upgrading from 1.10 to 1.11

I have this issue :(

@danielwhatmuff

Still got this on
RHEL 7.2 kernel-3.10.0-327.el7.x86_64
Docker version 1.9.1, build 78ee77d/1.9.1
device-mapper-libs-1.02.107-5.el7_2.1.x86_64

@guenhter

guenhter commented Apr 19, 2016

Also got the issue:

$ docker rm agent4
Error response from daemon: Driver aufs failed to remove root filesystem 16a3129667975c411d0084b38ba512761b64eaa7853f3452a7f8e4f2898d1175: rename /var/lib/docker/aufs/diff/76125e9141ec9de7c12e20d41b00cb44826b19bedf98bd9c650cb7a7cc07913a /var/lib/docker/aufs/diff/76125e9141ec9de7c12e20d41b00cb44826b19bedf98bd9c650cb7a7cc07913a-removing: device or resource busy

docker version

Client:
 Version:      1.11.0
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   4dc5990
 Built:        Wed Apr 13 18:26:49 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.0
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   4dc5990
 Built:        Wed Apr 13 18:26:49 2016
 OS/Arch:      linux/amd64

docker info

Containers: 9
 Running: 8
 Paused: 0
 Stopped: 1
Images: 80
Server Version: 1.11.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 193
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge null host
Kernel Version: 3.16.0-4-amd64
Operating System: Debian GNU/Linux 7 (wheezy)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.45 GiB
Name: chell
ID: BXX3:THMK:SWD4:FP35:JPVM:3MV4:XJ7S:DREY:O6XO:XYUV:RHXO:KUBS
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No oom kill disable support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support

uname -a

Linux chell 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-1 (2016-03-06) x86_64 GNU/Linux

@cpuguy83
Member

This is a mix of different issues. I think we need to close this. None of the latest reported cases are anything like the OP.

@guenhter I suspect this is related to another issue with mounting either /var/run into a container (any other container on your host) or mounting /var/lib/docker

@cpuguy83
Member

@guenhter For the record #21969

@cpuguy83
Member

Also, many of the pre-1.11 issues with "device or resource busy" type errors are most likely from killing the daemon (ungracefully) and then starting it back up.
This causes the internal ref counts on the storage driver mounts to be reset to 0, meanwhile the mounts themselves are still active.
1.11 addresses that case.
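
To illustrate the mismatch described here, the sketch below lists devicemapper mounts that are still active even though no running container accounts for them. It assumes the mount directory name equals the full container ID, which matches the 2014 paths in this thread but may not hold for newer Docker versions. The name orphaned_dm_mounts is hypothetical, and the running IDs are expected one per line in a file, for example the saved output of docker ps -q --no-trunc.

```shell
# Print IDs that have an active devicemapper mount but are not listed as running.
# Usage: orphaned_dm_mounts <running_ids_file> [mounts_file]
orphaned_dm_mounts() {
    awk -v prefix="/var/lib/docker/devicemapper/mnt/" '
        # First file: one running container ID per line.
        FILENAME == ARGV[1] { running[$1] = 1; next }
        # Second file: mount table, where field 2 is the mount point.
        index($2, prefix) == 1 {
            id = substr($2, length(prefix) + 1)
            if (!(id in running)) print id
        }
    ' "$1" "${2:-/proc/mounts}"
}
```

Any ID printed would be a candidate for the stale-mount situation: the mount exists, but the daemon no longer tracks a running container for it.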

Closing for reasons stated above.

@dsteinkopf

Sorry, I'm not sure I understand this. What do you mean by "None of the latest reported cases are anything like the OP"?
What should I (and others experiencing this problem) do? Open another case?

@cpuguy83
Member

@dsteinkopf Yes, with as much detail as you can provide (compose files, daemon logs, etc.).

@chirangaalwis

Hi, just a note on the issue I reported earlier: I have upgraded my kernel to 4.4.0-21-generic, and the docker version info is as follows:
Client:
Version: 1.11.0
API version: 1.23
Go version: go1.5.4
Git commit: 4dc5990
Built: Wed Apr 13 18:38:59 2016
OS/Arch: linux/amd64

Server:
Version: 1.11.0
API version: 1.23
Go version: go1.5.4
Git commit: 4dc5990
Built: Wed Apr 13 18:38:59 2016
OS/Arch: linux/amd64

The issue reported earlier seems to have stopped occurring. I have used Docker for a considerable time since upgrading the kernel, and the problem has not come back.

@GameScripting

GameScripting commented May 25, 2016

Found a workaround for the problem, at least when used with docker-compose; see #3786 (comment)

@gamesbook

Same issue with a container that is failing to restart.

Ubuntu 14.04
Kernel: 3.13.0-24-generic
Docker Version:

Client:
 Version:      1.11.0
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   4dc5990
 Built:        Wed Apr 13 18:34:23 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.0
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   4dc5990
 Built:        Wed Apr 13 18:34:23 2016
 OS/Arch:      linux/amd64

Error:

Error response from daemon: Driver aufs failed to remove root filesystem 
802f3a6eb28f8f16bf8452675a389b1d8bf755e59c827d50bec9372420f4194a: 
rename /var/lib/docker/aufs/diff/79e53988cfddcc3fb9868316bd9d8c3d7a825fd09a8620553c148bd96243224f /var/lib/docker/aufs/diff/79e53988cfddcc3fb9868316bd9d8c3d7a825fd09a8620553c148bd96243224f-removing: 
device or resource busy

Unmount fails:

umount: /var/lib/docker/devicemapper/mnt/79e53988cfddcc3fb9868316bd9d8c3d7a825fd09a8620553c148bd96243224f is not mounted (according to mtab)

@GameScripting

GameScripting commented Jun 23, 2016

This is still an issue for us (using 1.11.2 on Ubuntu 14.04.4 LTS with KVM, kernel 3.13.0-88-generic).

Is there an open ticket I can subscribe to for updates?

@gamesbook

@GameScripting See #21704

AkihiroSuda added a commit to AkihiroSuda/issues-docker that referenced this issue Jul 28, 2016
…e discussed in the ticket and hence misleading
@eromoe

eromoe commented Sep 26, 2016

Linux zk1 3.10.0-327.28.3.el7.x86_64(centos 7)
Docker version 1.12.1, build 23cf638

Error response from daemon: Driver devicemapper failed to remove root filesystem 228f2c2da3de4d5abd3881184aeb330a4c18e4311ecf404e2fb8cd4ffe15e901: devicemapper: Error running DeleteDevice dm_task_run failed

@perlun

perlun commented Oct 6, 2016

Just ran into this. /etc/init.d/docker restart helped, I'm happy this wasn't on a production machine... 😢

$ docker --version
Docker version 1.11.1, build 5604cbe

@Risto-Stevcev

Still getting this too

$ docker --version
Docker version 1.12.2, build bb80604

@SEAPUNK

SEAPUNK commented Oct 28, 2016

Same issue, has been happening over many many versions of Docker. I use docker-compose to recreate containers. Sometimes it works cleanly, sometimes it doesn't. Restarting the docker daemon or rebooting the server cleans up the bad container.

Arch Linux; devicemapper containers on ext4 FS.

$ docker version
Client:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.7.3
 Git commit:   6b644ec
 Built:        Thu Oct 27 19:42:59 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.7.3
 Git commit:   6b644ec
 Built:        Thu Oct 27 19:42:59 2016
 OS/Arch:      linux/amd64
$ docker info
Containers: 24
 Running: 22
 Paused: 0
 Stopped: 2
Images: 56
Server Version: 1.12.3
Storage Driver: devicemapper
 Pool Name: docker-8:3-13500430-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 9.394 GB
 Data Space Total: 107.4 GB
 Data Space Available: 78.15 GB
 Metadata Space Used: 24.82 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.123 GB
 Thin Pool Minimum Free Space: 10.74 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.135 (2016-09-26)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.7.2-1-ARCH
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 30.85 GiB
Name: omega
ID: IR7W:NSNN:F2B3:YP32:YTQJ:OFEB:2XLK:HHCK:HJ33:5K3O:KEHI:SDUB
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8
$ df -T
Filesystem     Type     1K-blocks      Used Available Use% Mounted on
dev            devtmpfs  16169500         0  16169500   0% /dev
run            tmpfs     16173076      2712  16170364   1% /run
/dev/sda3      ext4     447260560 371064976  53453004  88% /
tmpfs          tmpfs     16173076         0  16173076   0% /dev/shm
tmpfs          tmpfs     16173076         0  16173076   0% /sys/fs/cgroup
tmpfs          tmpfs     16173076      1144  16171932   1% /tmp
/dev/sda1      ext4        289293     45063    224774  17% /boot
tmpfs          tmpfs      3234612         8   3234604   1% /run/user/1000
/dev/sdb2      ext4     403042160  15056296 367489480   4% /run/media/ivan/backup
/dev/sda4      ext4     480580312 320608988 135536228  71% /run/media/ivan/ARCHIVES
/dev/sdb3      ext4     225472980   1473948 212522604   1% /run/media/ivan/data

@ghost

ghost commented Oct 28, 2016

If it helps...

I believe that I am having the same or a similar issue here as well. If I deploy a service using compose up -d, then change the image name in the compose.yaml and run compose up -d again, compose fails with a devicemapper error:

Error
ERROR: for <> Driver devicemapper failed to remove root filesystem 216c098e0f051407863934c27111bd1e9b7561dff1c4d67c0f0d45a99505fa70: Device is Busy

Version Information:
docker version
Client:
Version: 1.12.1
API version: 1.24
Go version: go1.6.3
Git commit: 23cf638
Built:
OS/Arch: linux/amd64

Server:
Version: 1.12.1
API version: 1.24
Go version: go1.6.3
Git commit: 23cf638
Built:
OS/Arch: linux/amd64

As a temporary workaround, I have added a docker-compose down --rmi all before rerunning the up.

@biolounge

I also have the same issue in Docker version: 1.12.3

@SEAPUNK

SEAPUNK commented Dec 6, 2016

I'm pretty sure that what the rest of the people here are experiencing is related to #27381

@bladedoyle

bladedoyle commented Dec 13, 2016

I'm seeing this in docker 1.12.3 on CentOS 7

dc2-elk-02:/root/staging/ls-helper$ docker --version
Docker version 1.12.3, build 6b644ec
dc2-elk-02:/root/staging/ls-helper$ uname -a
Linux dc2-elk-02 3.10.0-327.36.3.el7.x86_64 #1 SMP Mon Oct 24 16:09:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
dc2-elk-02:/root/staging/ls-helper$ docker rm ls-helper
Error response from daemon: Driver devicemapper failed to remove root filesystem e1b9cdeb519d2f4bea53a552c8b76c1085650aa76c1fb90c8e22cac9c2e18830: Device is Busy

P.S. I am not using docker compose.

@drzraf

drzraf commented Jun 9, 2017

Bitten by this after the host ran out of disk space.
Any command affecting the mount point hangs (including "docker ps", "sync", "ls ", ...)

@rakeshzingade

rakeshzingade commented Dec 16, 2018

I had a similar issue. I saw these error lines in my /var/log/syslog file:
Dec 16 14:32:18 rzing dockerd[3093]: time="2018-12-16T14:32:18.627417173+05:30" level=error msg="Failed to load container mount 00d7b9d64ff6c465276e67f5a5e3642ebacd9616c7602d4361b3a7fab038510a: mount does not exist"
Dec 16 14:32:18 rzing dockerd[3093]: time="2018-12-16T14:32:18.627816711+05:30" level=error msg="Failed to load container mount fb108b942f8ed87a9e1affb6480ed477a8f5f823b2639e36348cde4a97924c5e: mount does not exist"
I tried searching for the mount point under /var/lib/docker/volumes but didn't find any.
Finally, rebooting the system fixed the issue.
