Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce GH cache usage during build workflow #2891

Open
giggsoff opened this issue Oct 27, 2022 · 9 comments
Open

Reduce GH cache usage during build workflow #2891

giggsoff opened this issue Oct 27, 2022 · 9 comments
Labels
enhancement New feature or request

Comments

@giggsoff
Copy link
Contributor

Use case

We split build workflow into two jobs: packages (to build packages required to build EVE-OS image) and eve (to compose final EVE-OS image). We share the data between jobs using GitHub cache. The problem is in limited size of the cache. It uses some kind of FIFO and removes old stored caches. In case of several PRs comes to build concurrently we loose the cache between jobs.
I observed that during packages job we fill the cache with images that are already published, which is sub-optimal as we can pull them directly in eve job. We try to split our changes and modify only packages should be modified in one PR. We should not fill the cache with unneeded data.

Describe the solution you'd like

To reduce cache usage I see three possible solutions:

  1. Introduce some kind of cleanup for published images in linuxkit. We have cache clean command in linuxkit to remove entire cache, possibly we can add keep-unpublished flag to clean only published images.
  2. Introduce optional flag in linuxkit to validate only and to not pull the image during pkg build. In that case we still need to pull the images we depends on (if image B depends on image A, A was published and we want to build B, we still need A to be pulled into linukit cache). But we do not want to have published images in cache regardless of dependencies.
  3. Add logic on EVE-OS side to check if image already exists in external OCI registry before build (again we should somehow solve dependency problem).

cc @deitch

@deitch
Copy link
Contributor

deitch commented Oct 30, 2022

Introduce some kind of cleanup for published images in linuxkit. We have cache clean command in linuxkit to remove entire cache, possibly we can add keep-unpublished flag to clean only published images

I am not greatly enamoured of the idea of adding too much logic to the cache clean command. There are other tools that can check if an image is published or not, and then we can use the cache clean.

I could be convinced otherwise, however.

But our issue here really isn't clearing stuff out of the cache as opposed to not getting it in in the first place. Our specific problem is that we build or pull all packages in the build step, but don't really need them. The fact that we download them is just a secondary artifact of the fact that lkt pkg build will either build or download. In the build job, we don't really want it there unless we have to build it, i.e. it has changed.

That brings us to the second option you raised:

Introduce optional flag in linuxkit to validate only and to not pull the image during pkg build

This is the most efficient solution you propose. And it is fairly easy to implement. The question is, does it make sense in the context of lkt?

  • lkt pkg build currently means: "I want the package built in my cache. Whether you pull it down to build it, I don't care, but make it happen or fail." This has a simple logic and fits normal use cases.
  • lkt pkg build --validate-only (or --no-pull or whatever) is useful for us, but does it make sense in the context of normal linuxkit usage? What is the equivalent of "I want this in my cache, pull it or build it" of the current behaviour? The answer could very well be, "we just proved the use case". Or perhaps it is, "I want the package to exist either in the registry or my cache." I have a hard time seeing it, though.

One side issue you raised is:

we still need to pull the images we depends on (if image B depends on image A, A was published and we want to build B, we still need A to be pulled into linukit cache). But we do not want to have published images in cache regardless of dependencies

No, it isn't an issue. When we use buildkit to build, and it sees that B depends on A, it looks for A in the registry, finds it and uses it inside the buildkit container. Only if it cannot find it in the registry does it fail. linuxkit itself has wired into buildkit to override that by finding it in its local cache, but if it doesn't find it in local cache, buildkit does its usual thing and pulls from registry.

You can test it easily. Create a pkg, have it depends on something public, and do lkt pkg build. Despite it not being in the lkt cache, it will build fine, the output will be in lkt cache, but the dependency (A) will not.

Which brings us to your third option:

Add logic on EVE-OS side to check if image already exists in external OCI registry before build

This is the equivalent of your second option.

If your second option is: "build only if it is not already published via: lkt pkg build --no-pull pkg/foo", then this is "build only if it is not already published via:

TAG=$(lkt pkg show-tag pkg/foo)
[ exists-in-registry ] || lkt pkg build pkg/foo

I think I prefer this last option over any other. Only call lkt pkg build if you want to populate the cache. On the other hand, we would need some way to implement [ exists-in-registry ]. There are tools to do it (docker inspect does not work on remote images, only locally downloaded, although you could do docker pull to see if it exists, that is a bit overkill) , but the easiest tool at hand is... linuxkit. So we could do:

lkt pkg show-published pkg/foo

or even

lkt pkg show-tag --published pkg/foo

but I am not convinced that buys us all that much more than just going all the way to: lkt pkg build --validate-only.

Maybe we should do both?

@deitch
Copy link
Contributor

deitch commented Oct 30, 2022

Reading this again, I really am having a hard time reconciling lkt pkg build (the name of the command is "build") with, "if you found it in a registry, do nothing. I can reconcile it with "if you found it in a registry, pull it down", because the purpose of build is "make sure it is ready in my cache." This is no different than make foo where it means, "ensure I have foo here, either because it is there and unchanged, or because you built it."

But "build it unless it is in some registry, in which case leave it alone" is just strange for a command called build.

@giggsoff
Copy link
Contributor Author

giggsoff commented Nov 3, 2022

Seems we exceeded the cache limit in our runs: https://github.com/lf-edge/eve/actions/runs/3387572937/jobs/5628487487#step:17:3
Warning: Failed to save: Cache size of ~10630 MB (11146442317 B) is over the 10GB limit, not saving cache.

@deitch
Copy link
Contributor

deitch commented Nov 3, 2022

I'm working on those fixes. I will prioritize over the next 24 hours.

@giggsoff
Copy link
Contributor Author

giggsoff commented Nov 3, 2022

Ah, sorry, I did not try to push you. Actuality it is not great that we exceed the limit of this kind, because we will not be able to rebuild the whole EVE-OS for example while modifying of some base package (eve-alpine) and all dependencies. Probably we can avoid such PRs and split them.

@deitch
Copy link
Contributor

deitch commented Nov 3, 2022

We still need to fix it. But that will become a big issue.

@giggsoff
Copy link
Contributor Author

giggsoff commented Nov 3, 2022

Well, It was another issue: #2904

@deitch
Copy link
Contributor

deitch commented Nov 9, 2022

@giggsoff several linuxkit updates have been merged in.

  1. pkg build now checks if what is in the registry matches what will be built and does not pull it down unless you add --pull. Before it checked and pulled to cache; now that is only if you explicitly ask for it. That should keep the cache smaller.
  2. cache clean --published-only will remove any packages from the cache that are identical to what is in the registry. It is fairly conservative about it, removing only what is precisely identical, but that should help us.

Do you want to update lf-edge/eve/Makefile to use the latest commit, and the CI pipeline to do a clean?

@giggsoff
Copy link
Contributor Author

giggsoff commented Nov 9, 2022

Thank you!
Will work on integration them here.

@dautovri dautovri added the enhancement New feature or request label Dec 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants