✨ Improved garbage collection defaults #7392

Open
jedevc opened this issue May 16, 2024 · 5 comments

@jedevc
Member

jedevc commented May 16, 2024

Spinning this out of a private Discord discussion.

We inherit BuildKit's default GC policies, which try to keep cache usage under 10% of available disk.

This is not great for a lot of cases. Here are a few Dagger scenarios, and the kind of cache policy we might want for each:

  • Using dagger on a personal linux machine
    • I want dagger to use quite a lot of space - I use dagger for most of my dev work, so maybe it could use 20% of my disk (on the scale of several hundred gigs of data).
  • Using dagger in docker-desktop/lima/etc on a personal mac/windows/linux machine
    • Because this is a VM, there's a disk provisioned for it, and that disk is usually quite small (the Docker Desktop default sometimes seems unreasonably so).
    • I want dagger to use most of the available disk storage (but not all), maybe something like 50%. This works out to mostly the same as the above option, but now there are two dials to turn: the Docker Desktop disk size dial and the dagger disk size dial.
  • Using dagger in a dedicated CI worker
    • I want dagger to use as much storage as it requires - I want the limit to be something like 80-90%.

At the moment, we just have a flat "10%" for everything, which is pretty bad, especially for the last scenario, where we would want 80-90%.
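To make the scenarios above concrete, here is a minimal Go sketch of how a per-scenario keep budget could be computed. Everything in it is an assumption for illustration: the `Scenario` type, the `KeepBytes` function, and the specific percentages (picked from the ranges mentioned above) are hypothetical and do not reflect Dagger's or BuildKit's actual code.

```go
package main

import "fmt"

// Scenario is a hypothetical label for the usage cases described in this issue.
type Scenario string

const (
	PersonalLinux Scenario = "personal-linux"
	DesktopVM     Scenario = "desktop-vm"
	DedicatedCI   Scenario = "dedicated-ci"
)

// flatDefaultPercent mirrors the flat 10% default discussed in this issue.
const flatDefaultPercent uint64 = 10

// keepPercent holds illustrative values taken from the ranges in the issue;
// they are assumptions, not shipped defaults.
var keepPercent = map[Scenario]uint64{
	PersonalLinux: 20,
	DesktopVM:     50,
	DedicatedCI:   80,
}

// KeepBytes returns the cache budget in bytes for a disk of diskBytes.
// Unknown scenarios fall back to the flat default.
func KeepBytes(diskBytes uint64, s Scenario) uint64 {
	pct, ok := keepPercent[s]
	if !ok {
		pct = flatDefaultPercent
	}
	// Divide first so very large disks cannot overflow uint64.
	return diskBytes / 100 * pct
}

func main() {
	const gib = uint64(1) << 30
	disk := 500 * gib
	for _, s := range []Scenario{PersonalLinux, DesktopVM, DedicatedCI, "unknown"} {
		fmt.Printf("%-15s keep %d GiB\n", s, KeepBytes(disk, s)/gib)
	}
}
```

On a 500 GiB disk, the flat default keeps 50 GiB, while the dedicated CI value in this sketch would keep 400 GiB, which shows how far apart the scenarios are.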

Part of this is related to documentation + usability of configuring cache policy in the first place:

However, this is still different - we should have a way of making sure that the defaults are good for 90% of users. There's unfortunately no easy way to detect what kind of case we're in.

I think this makes more sense in the context of #5583 though. When we start a dagger engine with a driver, we can allow specifying the cache policies for that engine, and potentially automatically pick some reasonable defaults (e.g. based on the presence of GITHUB_ env vars, etc).
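As a rough illustration of the env-var idea, the Go sketch below guesses whether the engine is running on a CI worker and picks a keep percentage accordingly. The function names `looksLikeCI` and `defaultKeepPercent` are hypothetical, the 80% figure is just an example from the range above, and treating a non-empty `CI` variable as a signal is an added assumption beyond the `GITHUB_` prefix mentioned here.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// looksLikeCI reports whether any entry in environ ("NAME=value" pairs)
// hints at a CI worker. This heuristic is an assumption for illustration.
func looksLikeCI(environ []string) bool {
	for _, kv := range environ {
		name, value, _ := strings.Cut(kv, "=")
		// Any GITHUB_-prefixed variable, as suggested in the issue.
		if strings.HasPrefix(name, "GITHUB_") {
			return true
		}
		// Assumption: many CI systems also set CI to a truthy value.
		if name == "CI" && value != "" && value != "false" {
			return true
		}
	}
	return false
}

// defaultKeepPercent returns a hypothetical default: a high value on CI,
// otherwise the flat 10% discussed in this issue.
func defaultKeepPercent(environ []string) int {
	if looksLikeCI(environ) {
		return 80
	}
	return 10
}

func main() {
	fmt.Printf("default keep percentage: %d%%\n", defaultKeepPercent(os.Environ()))
}
```

A heuristic like this can only be a starting point: as noted above, there is no easy way to reliably detect the environment, so an explicit per-engine setting would still need to override it.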

@gtzo-anchorage

Just chiming in to say that this puzzled me a fair bit as a first-time Dagger user looking to implement it on dedicated CI runners. More polished cache control would be extremely useful!

@Scalahansolo

One thing to call out here: if running Dagger in K8s, Dagger will try to keep its memory cache around the requested memory, but it has the potential to spike well above that. Our memory limit is set really high to account for spikes, but GC can be pseudo-tuned by adjusting the requested memory value.

@marcosnils
Contributor

@jedevc given that this has bitten quite a few people and it might result in a poor experience when running dagger, WDYT about changing the default from 10% to something considerably higher, like 75%?

It's somewhat odd that the defaults are so low, especially when docker doesn't have any GC mechanism to prune its cache volumes and/or images.

Member Author

jedevc commented Jun 6, 2024

Yeah, that feels like a reasonable option for the short-term honestly.

Just bumping it up should make for a better general experience, but I still think there's some complexity with how to manage different environments, so that would only address the surface level of this IMO.

@marcosnils
Contributor

> but I still think there's some complexity with how to manage different environments, so that would only address the surface level of this IMO.

I totally agree with this. Maybe we can bump the default GC retention % for now and move this discussion to #5583 as you initially suggested?

marcosnils added a commit to marcosnils/dagger that referenced this issue Jun 6, 2024
stopgap fix for dagger#7392. Increasing the GC keep size to more reasonable defaults
so users don't get cache evicted so aggresively.

Signed-off-by: Marcos Lilljedahl <marcosnils@gmail.com>
marcosnils added a commit that referenced this issue Jun 6, 2024
stopgap fix for #7392. Increasing the GC keep size to more reasonable defaults
so users don't get cache evicted so aggresively.

Signed-off-by: Marcos Lilljedahl <marcosnils@gmail.com>
sipsma pushed a commit to sipsma/dagger that referenced this issue Jun 6, 2024
stopgap fix for dagger#7392. Increasing the GC keep size to more reasonable defaults
so users don't get cache evicted so aggresively.

Signed-off-by: Marcos Lilljedahl <marcosnils@gmail.com>

4 participants