Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache keys #12

Open
FiloSottile opened this issue Nov 20, 2020 · 4 comments
Open

Cache keys #12

FiloSottile opened this issue Nov 20, 2020 · 4 comments

Comments

@FiloSottile
Copy link
Contributor

This is an awesome resource, thank you!

I noticed you use ${{ hashFiles('**/go.sum') }} as a cache key for the module cache. I am not sure that does what it is intended for, since the go.sum file does not contain a hash of the module itself, and it might be missing from a module with no dependencies. If hashFiles includes the file names in the key, you could just use the go.mod files instead, but even that would exclude modules that did not update.

@mvdan
Copy link
Owner

mvdan commented Nov 20, 2020

Thanks for raising this issue! I've actually been thinking about cache keys too, lately.

We don't want the cache to be strict to the point of hashing the entire module and its source code, because then changing a single line would presumably result in an entirely separate cache key.

Really, the reason to not use a single cache key is so that we can store multiple "versions" of the cache in parallel. Right now we do one per OS, and one per go.sum content. This was originally copied from the official docs. I'm actually not sure that splitting the caches per OS is really necessary, because hopefully the cache behaves exactly the same across OSs and filesystems.

I'm also not entirely sure that including go.sum helps that much, because in general small variations in a go.sum should not result in a wildly different cache content (module download cache and/or build cache) that benefits from being stored separately.

Something to note is that restore-keys lets us reuse any inexact cache key match if an exact match doesn't exist, so even if go.sum doesn't match a previous cache key, we'd use another existing cache entry.

@FiloSottile
Copy link
Contributor Author

It's occurring to me that maybe **/go.sum matches inside the main module, not inside the cache, which actually makes total sense, since the module cache of a tidy module should match its go.sum. (I think it's the ** that confused me.) So maybe this issue is moot, but reusing inexact cache keys also sounds like a nice improvement.

@mvdan
Copy link
Owner

mvdan commented Nov 20, 2020

Yeah, I don't believe go.sum files inside the cache are part of the key. It would be confusing for the cache value to be part of its cache key.

Reusing inexact cache keys is already in place, for what it's worth. So we already do share slightly different go.sum caches anyway.

The difference, I think, is how wasteful we want to be with GitHub's cache space allowance. If we store one cache entry per go.sum hash, we'll end up with more cache entries, getting closer to the 5GiB total limit at some point and causing evictions. On the other hand, using less specific cache keys (such as just the OS name) would use fewer keys, and keep replacing them.

@mvdan
Copy link
Owner

mvdan commented Mar 2, 2022

FYI, when persisting GOCACHE via actions/cache, I've now included the Go version as part of the suggested cache key. See 4b75472.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants