Device-mapper does not release free space from removed images #3182
Comments
+1, I'm very interested in hearing some discussion on this subject. My strategy so far has been
@AaronFriel, which version of Docker are you on? 0.7.1? |
Starting from a fresh /var/lib/docker:
Then after docker pull busybox it grew a bit:
docker rmi busybox does not make the file larger, but makes the space free in the devicemapper pool:
The loopback file doesn't grow if we download the image again:
So, it seems like we fail to re-sparsify the loopback file when the thinp device discards a block. |
However, if I create a file inside the container fs image, it does reclaim the space in the loopback file.
This grew the data file from 299M to 344M. But when I removed a_file.bin (and waited a bit) it got back to 299M. So this looks to me like a devicemapper bug: it forwards discards from the thinp device to the underlying device, but it doesn't discard when removing thinp devices from the pool. |
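For anyone who wants to see what a forwarded discard amounts to on the backing file: as far as I know, the loop driver turns a discard into a hole punch on the file, which is what fallocate(2) does with FALLOC_FL_PUNCH_HOLE. Below is a minimal sketch of that effect on an ordinary file, assuming 64-bit Linux and a filesystem that supports hole punching. The helper names punch_hole and demo are mine for illustration; none of this is Docker code.

```python
import ctypes
import os
import tempfile

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02


def punch_hole(fd, offset, length):
    """Deallocate [offset, offset + length) of an open file, keeping its size."""
    libc = ctypes.CDLL(None, use_errno=True)
    # Assumption: off_t is 64 bits wide (true on 64-bit Linux).
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_longlong, ctypes.c_longlong]
    libc.fallocate.restype = ctypes.c_int
    mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
    if libc.fallocate(fd, mode, offset, length) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def demo(size=4 * 1024 * 1024):
    """Return (apparent_size, blocks_before, blocks_after), or None if unsupported."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"\xff" * size)          # non-zero data so blocks really get allocated
        f.flush()
        os.fsync(f.fileno())
        before = os.fstat(f.fileno()).st_blocks
        try:
            punch_hole(f.fileno(), 0, size)
        except (OSError, AttributeError):
            return None                  # no fallocate symbol or no hole-punch support
        os.fsync(f.fileno())
        st = os.fstat(f.fileno())
        return st.st_size, before, st.st_blocks


if __name__ == "__main__":
    print(demo())
```

The apparent size stays the same while the allocated block count drops, which matches the 344M to 299M behaviour described above when the discard does reach the loopback file.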
This seems to be a kernel issue. I was looking at working around it by using BLKDISCARD, but I failed. See this bug for some details: https://bugzilla.redhat.com/show_bug.cgi?id=1043527 |
I put my workaround in https://github.com/alexlarsson/docker/tree/blkdiscard, but we're still researching if we can do better than this. |
Upstream dm comments on this issue: |
Having this problem on CentOS (2.6.32-358.23.2.el6.x86_64) with Docker 0.7.0, as well. Old, but the problem's not isolated to Ubuntu. |
Same issue on Arch GNU/Linux 3.12.6-1-ARCH, Docker version 0.7.2. |
Still exists on 0.7.0 on CentOS. |
Still exists in 0.7.2 on ubuntu 12.04.3 LTS. A lot of the space is in It's neat that I learned you can see the container file systems in but it's not neat that my hard disk is almost completely eaten up and only fixable by |
I am having a similar issue while writing docker support for rspec-system. My test VM (docker host) has an 8GB drive, and after repeatedly creating images without deleting them my drive fills up. But after removing all images and containers the drive is still 100% full. I figured it was an ID-10T error but just gave up and destroyed the VM altogether. |
Still exists in 0.7.5 on Ubuntu 13.04. |
This issue has been fixed by PR #3256 which was recently merged. This fix will be included in a future release. I'm closing this issue now because the fix has been merged to master. |
Note: It's not fully fixed until you also run a kernel with http://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=6d03f6ac888f2cfc9c840db0b965436d32f1816d in it. Without that the docker fix is only partial. |
What is the workaround to reclaim the space? I am using RHEL 6.5 and it might be a while before I get the new kernel.
|
@logicminds There is no super easy way to recover the space atm. Technically it should be possible to manually re-sparsify the loopback files. But that would require all the unused blocks to be zeroed (or something similar) for easy detection of the sparse areas, which is not done on thinp device removal. |
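To make the "detect zeroed blocks" idea concrete, here is a rough sketch of the scanning half: it walks a file block by block and reports runs of all-zero blocks, which are the candidates a re-sparsify tool could deallocate. The 4096-byte block size and the names find_zero_runs and make_sample are assumptions for illustration. As noted above, this only helps if unused blocks actually contain zeros, which thinp device removal does not guarantee.

```python
import os
import tempfile


def find_zero_runs(path, block_size=4096):
    """Yield (offset, length) for each maximal run of all-zero blocks in a file."""
    run_start = None
    offset = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            if chunk.count(0) == len(chunk):      # every byte in this block is zero
                if run_start is None:
                    run_start = offset
            elif run_start is not None:
                yield run_start, offset - run_start
                run_start = None
            offset += len(chunk)
    if run_start is not None:                     # file ended inside a zero run
        yield run_start, offset - run_start


def make_sample():
    """Write a small test file: data, two zero blocks, data, one zero block."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\x01" * 4096 + bytes(8192) + b"\x01" * 4096 + bytes(4096))
    return path


if __name__ == "__main__":
    sample = make_sample()
    for start, length in find_zero_runs(sample):
        print(start, length)
    os.remove(sample)
```

Reading a multi-gigabyte loopback file this way is slow, so treat it as a way to understand the approach rather than a tool to run on a production host.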
@alexlarsson Does this also affect OEL 6.5? The OEL 6.5 actually uses the uek 3.8 linux kernel and since I have the option between switching from 2.6 to 3.8 kernel this might be a simple switch for me. |
@logicminds I don't even know if that commit is in the upstream kernel yet. That link is from the device-mapper tree. It's definitely not in 3.8. |
I'm looking at creating a tool like fstrim that can be used to get back the space. |
This command suspends the pool, extracts all metadata from the metadata pool and then manually discards all regions not in use on the data device. This will re-sparsify the underlying loopback file and regain space on the host operating system. This is required in some cases because the discards we do when deleting images and containers aren't enough to fully free all space unless you have a very new kernel. See: moby#3182 (comment) Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
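The core of "discard all regions not in use" is computing the complement of the used block ranges. The sketch below shows only that step, assuming the used ranges have already been extracted from the pool metadata as (start, length) pairs in units of data blocks. The function name free_ranges and that input format are hypothetical and not taken from the actual commit.

```python
def free_ranges(used, total_blocks):
    """Return the (start, length) ranges not covered by any used range.

    used: iterable of (start, length) pairs, possibly unsorted or overlapping.
    total_blocks: size of the data device in blocks.
    """
    free = []
    cursor = 0                                   # first block not yet accounted for
    for start, length in sorted(used):
        if start > cursor:
            free.append((cursor, start - cursor))
        cursor = max(cursor, start + length)
    if cursor < total_blocks:
        free.append((cursor, total_blocks - cursor))
    return free


if __name__ == "__main__":
    # Blocks 0-9 and 20-24 are in use on a 100-block device.
    print(free_ranges([(0, 10), (20, 5)], 100))
```

Each resulting range would then be discarded on the data device while the pool is suspended, so that nothing can allocate a block between reading the metadata and issuing the discard.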
There's always rkt… |
the general bitchy unhelpful snark is no doubt why no-one from upstream cares to give you proper answers. |
That's like Oracle telling a Java developer to use PHP due to a JVM bug. That's also not consistent with the elevator pitch here
I'm sure a lot of people are grateful that Docker took off like it did, and that couldn't have happened without volunteering from the community. However, it shouldn't be this hard to admit that it has its problems too without implicitly dropping the "I'm an upstream contributor so shut up and listen" line whenever someone brings up an unlikable point. |
Wait. I did report an issue, provided the details of my machine and setup,
|
First of all, if you still have this problem, please open a new issue;
You replied on a 3 year old, closed issue; following the discussion above, the original issue was resolved. Your issue may be the same, but needs more research to be sure; the errors you're reporting indicate that it may actually be something else. I really recommend opening a new issue rather than commenting on a closed one.
You're not obliged to, but without any information to go on, it's unlikely to be resolved. So, when reporting a bug, please include the information that's asked for in the template:
If you mean "one of the maintainers", please keep in mind that there are almost 24000 issues and PRs, and fewer than 20 maintainers, many of whom do that alongside their daily job. Not every comment will be noticed, especially if it's on a closed issue.
It's the default if aufs, btrfs, and zfs are not supported. You can find the priority that's used when selecting drivers in daemon/graphdriver/driver_linux.go. It's still above overlay, because unfortunately there are some remaining issues with that driver that some people may be affected by. Automatically selecting a graphdriver is just to "get things running"; the best driver for your situation depends on your use-case. Docker cannot make that decision automatically, so this is up to the user to configure.
Reading back the discussion above, I see that the upstream devicemapper maintainers have looked into this multiple times, trying to assist users reporting these issues, and resolving the issue. The issue was resolved for those that reported it, or in some cases, depended on distros updating devicemapper versions. I don't think that can be considered "not caring".
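The selection logic described here boils down to "first supported entry in a priority list wins". A minimal sketch of that idea follows; it only encodes what is stated above (aufs, btrfs and zfs are tried before devicemapper, and devicemapper before overlay). The relative order among the first three and the name pick_driver are assumptions; the real list lives in daemon/graphdriver/driver_linux.go and may differ between versions.

```python
# Illustrative priority order; only "aufs/btrfs/zfs before devicemapper before
# overlay" comes from the comment above, the rest is an assumption.
PRIORITY = ["aufs", "btrfs", "zfs", "devicemapper", "overlay"]


def pick_driver(supported, priority=PRIORITY):
    """Return the first driver in priority order that the host supports, or None."""
    for name in priority:
        if name in supported:
            return name
    return None


if __name__ == "__main__":
    # A host with neither aufs, btrfs nor zfs ends up on devicemapper.
    print(pick_driver({"devicemapper", "overlay"}))
```

Setting the storage driver explicitly in the daemon configuration bypasses this automatic choice entirely, which is the point made above about it being up to the user.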
Running on loop devices is fine for getting docker running, and currently the only way to set up devicemapper automatically. For production, and to get a better performance overall, use direct-lvm, as explained in the devicemapper section in the storage driver user guide.
That's out of scope for the installation, really. If you're going to use some software in production, it should be reasonable to assume that you get yourself familiar with that software, and know what's needed to set it up for your use case. Some maintainers even argued if the warning should be output at all. Linux is not a "holding hands" OS (does your distro show a warning that data loss can occur if you're using RAID-0? If you have ports opened in your Firewall?) |
Deeply reluctant as I am, to once again resurrect this ancient thread, there is still no meaningful advice in it about how to work around this issue on an existing machine encountering this issue. This is my best effort at a tldr; for the entire thread; I hope it helps others who find this thread.
Issue encountered
Your volume has a significant (and growing) amount of space which is in
Resolution
You're out of luck. Upgrade your file system or see
Issue encountered
Your volume has a significant (and growing) amount of space which is in
Resolution
You may be able to reclaim space on your device using standard docker commands. Read http://blog.yohanliyanage.com/2015/05/docker-clean-up-after-yourself/
Run these commands:
If you have nothing listed in any of these, see
If you see old stale images, unused containers, etc. you can perform manual cleanup with:
This should reclaim much of the hidden container space in the devicemapper.
Blowing docker away
Didn't work? You're out of luck. Your best bet at this point is:
This will destroy all your docker images. Make sure to export ones you want to keep before doing this.
Ultimately, please read https://docs.docker.com/engine/userguide/storagedriver/device-mapper-driver/#configure-direct-lvm-mode-for-production; but I hope this will assist others who find this thread. If you have problems with using advice above open a new ticket that specifically addresses the issue you encounter and link to this issue; do not post it here. |
You can also use nuke-graph-directory.sh. |
Added the workaround to avoid the thin pool exhaustion probably caused by the issue moby/moby#3182 Change-Id: I2edc74189872754383bf9a1f3215490d46e38b27
After removing the files as stated above, I can't start Docker any more: |
Just ran into this issue on CentOS 7.3 and didn't want to debug devmapper issues that have existed for more than 3 years, so I followed this DC/OS guide, purged the original packages, switched over to overlayfs and everything seems to work fine now: https://dcos.io/docs/1.7/administration/installing/custom/system-requirements/install-docker-centos/ (had to modify the ExecStart command for docker version 17.03 though -> "dockerd --storage-driver=overlay")
(purging volumes, images and containers didn't help. Deleting stuff in /var/lib/docker led to the problem described by @gdring2 ) |
Running https://docs.docker.com/engine/reference/commandline/system_prune/ |
Well, this is kind of... lame. In my case I found this issue after I uninstalled Docker and deleted the I found that my system was not reporting the space from deleting The fix to this is to simply reload your file system; in my case I just rebooted and the space was reclaimed. |
I can't believe it's still an issue! come on guys, i'm still having that |
@shahaf600 what version of docker are you running? Also see my comment above; #3182 (comment) Without details there's not much to say about your situation; your case could be caused by a different issue, but resulting in a similar result. |
good |
after buying one of these pieces of garbage and seeing the state of support, i returned it. |
There's your first problem @misterbigstuff...you bought something that's open source? |
and returned it |
Docker claims, via docker info, to have freed space after an image is deleted, but the data file retains its former size, and the sparse file allocated for the device-mapper storage backend file will continue to grow without bound as more extents are allocated.
I am using lxc-docker on Ubuntu 13.10:
This sequence of commands reveals the problem:
Doing a docker pull stackbrew/ubuntu:13.10 increased the space usage reported by docker info. Before:
And after docker pull stackbrew/ubuntu:13.10:
And after docker rmi 8f71d74c8cfc, it returns:
Only problem is, the data file has expanded to 414MiB (849016 512-byte sector blocks) per stat. Some of that space is properly reused after an image has been deleted, but the data file never shrinks. And under some mysterious condition (not yet able to reproduce) I have 291.5 MiB allocated that can't even be reused.
My dmsetup ls looks like this when there are 0 images installed:
And a du of the data file shows this:
How can I have docker reclaim space, and why doesn't docker automatically do this when images are removed?
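For readers checking the same numbers on their own host, the distinction that matters is apparent size versus allocated size of the sparse data file: stat reports allocation in 512-byte units, so 849016 blocks works out to roughly 414 MiB. A small sketch of that comparison, assuming Linux semantics for st_blocks; the helper names are mine and the temporary file stands in for the real data file.

```python
import os
import tempfile

SECTOR_BYTES = 512  # st_blocks is counted in 512-byte units on Linux


def sectors_to_mib(sectors):
    """Convert a count of 512-byte sectors to MiB."""
    return sectors * SECTOR_BYTES / (1024 * 1024)


def apparent_and_allocated(path):
    """Return (apparent size in bytes, bytes actually allocated on disk)."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * SECTOR_BYTES


def make_sparse_file(size):
    """Create a temporary file of the given apparent size without writing data."""
    fd, path = tempfile.mkstemp()
    os.ftruncate(fd, size)
    os.close(fd)
    return path


if __name__ == "__main__":
    print(sectors_to_mib(849016))            # the block count quoted from stat above
    path = make_sparse_file(1024 * 1024)
    print(apparent_and_allocated(path))      # allocated is typically far below apparent
    os.remove(path)
```

Pointing apparent_and_allocated at the devicemapper data file before and after docker rmi shows whether the space was actually returned to the host or only marked free inside the pool.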