reduce docker images #4637

Open
markus2330 opened this issue Nov 5, 2022 · 16 comments
Comments

@markus2330
Contributor

markus2330 commented Nov 5, 2022

@markus2330 markus2330 added the "continuous integration" and "triage needed" (Issue needs clarifications.) labels Nov 5, 2022
@markus2330
Contributor Author

As 0x6178656c wrote in #4620 (comment):

  • If this image is still relevant this should be documented accordingly
  • If this image is no longer used it should be removed from repository

@markus2330 markus2330 changed the title cleanup unused docker images reduce docker images Nov 13, 2022
@markus2330
Contributor Author

We are regularly running into "no space left" problems because of too many Docker images, so I tagged it as urgent and removed the "probably to be removed".

@mpranj any other suggestions other than the two Docker images above?

@markus2330 markus2330 mentioned this issue Nov 13, 2022
@kodebach
Member

One thing I noticed about our images is that they are very big. Maybe we can look into making them smaller, that should help with the disk space problems.

@mpranj
Member

mpranj commented Nov 13, 2022

We are regularly running into "no space left" problems because of too many Docker images, so I tagged it as urgent and removed the "probably to be removed".

I think this will not save any space.

AFAIK: removing unused images will do nothing to our disk space usage, as the images are not built by the pipeline. The images are built only when needed.
We are actively using many images, so that is a problem.

One thing I noticed about our images is that they are very big.

Would be great if we can do something about this.

Maybe we can add the docker build option --squash to avoid storing multiple layers of the filesystem.
There are always pros and cons, but it's worth a shot.
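A sketch of the suggested invocation (the image tag and Dockerfile path are assumptions, not the actual CI setup; note that --squash also requires the Docker daemon's experimental features to be enabled):

```shell
# Sketch only: tag and path are assumptions.
# --squash needs "experimental": true in /etc/docker/daemon.json.
docker build --squash \
  -t build-elektra-fedora-36 \
  -f scripts/docker/fedora/36/Dockerfile \
  scripts/docker/fedora/36
```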

@kodebach
Member

Maybe we can add the docker build option --squash to avoid storing multiple layers of the filesystem.

Wouldn't that mean different images can't share a layer and all images would have to be built entirely from scratch, if there is the tiniest difference?

The images are build only when needed.

So do we actually build new images for every Jenkins run? Is there any kind of auto-cleanup?


Also, since I don't have access to the CI servers: Are we sure that the docker images are the problem? Could there be something else that is eating disk space too, e.g. log files with long retention periods, or artifacts of old builds?
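Without server access, a quick way to answer this would be something like the following (standard Docker and coreutils commands; the Jenkins home path is an assumption):

```shell
# Break disk usage down by images, containers, volumes and build cache
docker system df -v

# List images, largest first (sort -h copes with the MB/GB suffixes)
docker image ls --format '{{.Size}}\t{{.Repository}}:{{.Tag}}' | sort -rh | head -n 20

# Compare against other likely space hogs; /var/lib/jenkins is an assumed path
sudo du -sh /var/lib/docker /var/lib/jenkins 2>/dev/null
```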

@mpranj
Member

mpranj commented Nov 13, 2022

Wouldn't that mean different images can't share a layer and all images would have to be built entirely from scratch, if there is the tiniest difference?

Yes, but I'll test this now to see if there is any difference.
Also, I know this is how it should work on a single machine, but I have a feeling we're not reusing layers anyway.

Also, since I don't have access to the CI servers: Are we sure that the docker images are the problem?

Yes, pretty sure it is at least the biggest problem. Most other things are cleaned up.

So do we actually build new images for every Jenkins run? Is there any kind of auto-cleanup?

Not for every run, but when they are needed. So images are reused once they are built. They are rebuilt monthly so that the packages are updated periodically.

@mpranj
Member

mpranj commented Nov 13, 2022

Wouldn't that mean different images can't share a layer and all images would have to be built entirely from scratch, if there is the tiniest difference?

Unfortunately you're right.
I've tested the --squash option and for the case of the build-elektra-fedora-36 images the difference is only 2.16GB vs 2.03GB.

@kodebach
Member

Okay, how exactly is our Fedora 36 image over 2GB in size, when the base fedora:36 image is <60MB (see Docker Hub)? There has to be something in there that we don't need...

Another thing we could do: Remove Java from all images except one, maybe even remove it completely from Jenkins and only test on Cirrus. The JVM should be the same everywhere.

@markus2330 markus2330 removed the "urgent" and "triage needed" (Issue needs clarifications.) labels Nov 14, 2022
@markus2330
Contributor Author

AFAIK: removing unused images will do nothing to our disk space usage, as the images are not built by the pipeline. The images are built only when needed.

Yes, this is why I extended the scope of this issue: the idea was to suggest which of the used Docker images (probably the least important ones) to remove, or how to make them smaller.

Another thing we could do: Remove Java from all images except one, maybe even remove it completely from Jenkins and only test on Cirrus. The JVM should be the same everywhere.

Actually, Java in particular is very prone to problems in CMake detection and similar. So it is good to have these tests across several distributions.

Btw. the issue seems to be less urgent than I thought. Used disk space is now: 346G used, 1.5T available, i.e. 20% used, so the problem is simply that running docker prune -af once a month was not enough.

Further suggestions on what to reduce are nevertheless welcome. At some point we will need to do the cleanup.

@markus2330
Contributor Author

markus2330 commented Nov 14, 2022

Also, since I don't have access to the CI servers: Are we sure that the docker images are the problem? Could there be something else that is eating disk space too, e.g. log files with long retention periods, or artifacts of old builds?

After running docker prune -af on a7, the disk space usage goes from 100% to less than 20%.

@kodebach
Member

the idea was to suggest which used Docker images (probably the least important ones) to remove or how to make them smaller.

I see we have 4 different Debian Bullseye images? Why? I get the minimal image to test without installing dependencies, but the rest are probably wasting space. The same goes for Debian Buster.

Also, if docker image prune -af (or even docker system prune) cleaned up > 1TB of space, I would really be interested in what exactly was removed; e.g. the output of docker image ls before and afterwards would be interesting.

Additionally, we can probably run docker image prune (without -a) much more often. It should not remove anything we need.

@mpranj
Member

mpranj commented Nov 14, 2022

Btw. the issue seems to be less urgent than I thought. Used disk space is now: 346G used, 1.5T available, i.e. 20% used, so the problem is simply that running docker prune -af once a month was not enough.

Also, if docker image prune -af (or even docker system prune) cleaned up > 1TB of space

Seriously doubt this happened. Usually it cleans about 100-200GB.
Maybe we should run prune -af weekly? Currently prune -f is run daily and prune -af monthly.
Note that deleting all images also means that the current ones need to be fetched from our docker registry, which has a rather slow connection.

What machine are you talking about?

On a7 we store the:

  • docker registry on spinning 2TB disks.
  • /var/lib/docker and JenkinsHome and other stuff on the 250GB SSD, which is usually what runs out of space.

What might be a problem:
The build agents keep current images which they need. (so far, everything is OK)
When a Dockerfile is changed, a new version of this image is built and the build agents retrieve this image. Now we have two versions of this image per build agent.
The issue worsens when multiple PRs change images multiple times.
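The daily/weekly/monthly prune schedule discussed above could be written as cron entries, e.g. (times and the file path are assumptions, not the actual CI configuration):

```shell
# /etc/cron.d/docker-cleanup -- sketch only
# Daily: drop dangling images (safe, touches nothing tagged or in use)
0 3 * * *  root  docker image prune -f
# Weekly instead of monthly: drop all images not used by any container
0 4 * * 0  root  docker image prune -af
```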

@markus2330
Contributor Author

I see we have 4 different Debian Bullseye images? Why?

To also test cmake exclusion of modules. Probably we should make these images build upon each other to use less space?

Maybe we should prune -af weekly?

Yes, sounds like the easiest solution for now. Is there some way to only clean up the images that weren't used for a week?

What machine are you talking about?

In #4637 (comment) I was talking about a7 of the recent incident #160 (comment).

@kodebach
Member

To also test cmake exclusion of modules. Probably we should make these images build upon each other to use less space?

Building the images on top of each other would definitely help.

There are probably a few other things we can do, like reducing the number of RUN instructions to reduce layers, or checking that we're not installing e.g. GUIs or other unnecessary packages.
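As a sketch of the layer-reduction idea (the package list is illustrative, not Elektra's actual one): chaining commands into a single RUN keeps the layer count down and lets the package cache be removed in the same layer that created it, so it never ends up stored in the image:

```dockerfile
FROM fedora:36

# One RUN for install + cleanup: the dnf cache never persists in a layer
RUN dnf install -y --setopt=install_weak_deps=False \
        cmake gcc-c++ git ninja-build \
    && dnf clean all \
    && rm -rf /var/cache/dnf
```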

Is there some way to only cleanup the images that weren't used for a week?

Yes, the --filter argument can be used with a timestamp. See e.g. this page
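For example (with the caveat that for docker image prune the "until" filter goes by image creation time, not last use, so this only approximates "unused for a week"):

```shell
# Remove images created more than a week ago (168h = 7 * 24h)
# that are not used by any container
docker image prune -a --force --filter "until=168h"
```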

@4ydan
Contributor

4ydan commented Jun 22, 2023

Fedora 32 Docker image analysis

So I did a small investigation on the "scripts/docker/fedora/32/Dockerfile" image.
I analyzed its layers and most of the size comes from all the packages installed.
The whole image is 2.61GB and around 2.4GB are packages.

Top 10 packages by size.

| Package | Size (MB) |
| ------- | --------- |
| golang-bin-1.14.15-3.fc32.x86_64 | 255.98 |
| java-11-openjdk-headless-11.0.11.0.9-0.fc32.x86_64 | 170.76 |
| java-1.8.0-openjdk-headless-1.8.0.292.b10-0.fc32.x86_64 | 117.47 |
| clang-libs-10.0.1-3.fc32.x86_64 | 92.07 |
| gcc-10.3.1-1.fc32.x86_64 | 81.71 |
| llvm-libs-10.0.1-4.fc32.x86_64 | 78.23 |
| glibc-debuginfo-2.31-6.fc32.x86_64 | 76.42 |
| mesa-dri-drivers-20.2.3-1.fc32.x86_64 | 65.74 |
| glibc-debuginfo-common-2.31-6.fc32.x86_64 | 57.20 |
| python27-2.7.18-8.fc32.x86_64 | 54.59 |

Improvements

Adding weak_deps=False option

dnf install --setopt=install_weak_deps=False

--setopt=install_weak_deps=False: This flag disables the installation of weak dependencies, which can help reduce the number of unnecessary packages installed. Equivalent to --no-install-recommends in apt-get.

Result

Adding this dnf option reduced the image size by ~15%.

It might be interesting to use a container registry like ghcr.io to reduce duplication and build some base images that other Dockerfiles could build upon.
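A sketch of that workflow (the registry path, image names and environment variables are hypothetical):

```shell
# Authenticate against GitHub Container Registry
echo "$GITHUB_TOKEN" | docker login ghcr.io -u "$GITHUB_USER" --password-stdin

# Build and publish a shared base image; the name is hypothetical
docker build -t ghcr.io/elektrainitiative/build-elektra-base:latest .
docker push ghcr.io/elektrainitiative/build-elektra-base:latest

# Other Dockerfiles would then start with:
#   FROM ghcr.io/elektrainitiative/build-elektra-base:latest
```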

@markus2330
Contributor Author

Thank you for the investigation. Yes, please add this option.

4ydan added a commit to 4ydan/libelektra that referenced this issue Jul 3, 2023
@4ydan 4ydan mentioned this issue Jul 3, 2023