✨ Improved garbage collection defaults #7392

Open
jedevc opened this issue May 16, 2024 · 0 comments


jedevc commented May 16, 2024

Spinning this out of a private Discord discussion.

We inherit buildkit's default gc policy, which tries to keep cache usage under 10% of available disk.

This is not great for a lot of cases. Here are a few dagger scenarios, and the kind of cache policy we might want for each:

  • Using dagger on a personal linux machine
    • I want dagger to use quite a lot of space - I use dagger for most of my dev work, so maybe it could use 20% of my disk (on the scale of several hundred gigs of data).
  • Using dagger in docker-desktop/lima/etc on a personal mac/windows/linux machine
    • Because this is a VM, there's a disk provisioned for it, and that disk is usually quite small (the docker desktop default seems unreasonably small sometimes).
    • I want dagger to use most of the available disk storage (but not all), maybe something like 50% (this works out to mostly the same as the above option - but now there are two dials to turn, the docker desktop disk size dial, and the dagger disk size dial).
  • Using dagger in a dedicated CI worker
    • I want dagger to use as much storage as it requires - I want the limit to be something like 80-90%.

At the moment, we just have a flat "10%" for everything - which is pretty bad, especially for the last scenario.
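To make the gap concrete, here is a minimal sketch that turns the percentages from the scenarios above into byte budgets. The scenario names, the `CACHE_PERCENT` table, and `cache_budget_bytes` are hypothetical and not part of dagger or buildkit; the 80 for CI is just the lower end of the "80-90%" range mentioned above.

```python
import shutil

# Percent of total disk the cache may use, per scenario.
# Values come from the scenarios in this issue; the keys are made-up labels.
CACHE_PERCENT = {
    "inherited-default": 10,  # the flat 10% we have today
    "personal-linux": 20,
    "desktop-vm": 50,
    "ci-worker": 80,          # assumption: lower end of "80-90%"
}


def cache_budget_bytes(scenario: str, total_disk_bytes: int) -> int:
    """Return the cache size limit in bytes for a scenario (integer math only)."""
    return total_disk_bytes * CACHE_PERCENT[scenario] // 100


if __name__ == "__main__":
    # Report budgets for the disk that holds the filesystem root.
    total = shutil.disk_usage("/").total
    for name in CACHE_PERCENT:
        gib = cache_budget_bytes(name, total) / 2**30
        print(f"{name}: {gib:.1f} GiB")
```

On a 1 TB disk, the difference between the inherited default and the CI figure is roughly 100 GB versus 800 GB of usable cache.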

Part of this is related to documentation + usability of configuring cache policy in the first place:

However, this is still different - we should have a way of making sure that the defaults are good for 90% of users. There's unfortunately no easy way to detect what kind of case we're in.

I think this makes more sense in the context of #5583 though. When we start a dagger engine with a driver, we can allow specifying the cache policies for that engine, and potentially automatically pick some reasonable defaults (e.g. based on the presence of GITHUB_ env vars, etc).
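As a rough illustration of the env-var idea, the sketch below picks a default cache percentage from the environment. `default_cache_percent` is a hypothetical helper, not an existing dagger API; it assumes that any `GITHUB_`-prefixed variable means a CI worker, and it falls back to the current 10% when nothing is detected.

```python
import os
from typing import Mapping


def default_cache_percent(env: Mapping[str, str]) -> int:
    """Guess a cache limit (percent of disk) from environment variables."""
    # Assumption: any GITHUB_* variable means we are on a GitHub CI worker.
    if any(key.startswith("GITHUB_") for key in env):
        return 80  # assumption: lower end of the "80-90%" CI range
    # Nothing detected: keep the inherited 10% default.
    return 10


if __name__ == "__main__":
    print(default_cache_percent(os.environ))
```

Taking the environment as a parameter, rather than reading `os.environ` inside the function, keeps the heuristic easy to test and leaves room for an explicit per-engine setting to override it.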
