
Making sha256 or sha1 sum for http_archive optional #576

Open
dvtkrlbs opened this issue Feb 23, 2024 · 3 comments

Comments

@dvtkrlbs
Contributor

I have been trying to create a tool like reindeer, but for Go modules. I've hit a barrier: generating the http_archive rules for the required modules requires the SHA sum of the file being downloaded, and there is no easy way to get that hash without actually downloading the file. go.sum files and the Go checksum database contain hashes, but they are not hashes of the zip file itself; they are produced by a directory-hash function over the actual contents of the archive. One way around this would be to download everything with the tool, calculate the hashes, and then generate the rules. Is there a way to circumvent this?
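For context on why the go.sum hashes can't be reused directly: Go's "h1:" entries come from a directory-hash function (dirhash Hash1) over the files inside the module zip, so two byte-different zips with identical contents share the same h1 value, while their plain sha256 sums differ. A simplified Python sketch of that property (not the exact golang.org/x/mod implementation):

```python
import base64
import hashlib
import io
import zipfile

def hash1_of_zip(zip_bytes: bytes) -> str:
    """Sketch of Go's dirhash "Hash1" (the "h1:" entries in go.sum):
    a hash over the *contents* of the module zip, not the zip bytes."""
    lines = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in sorted(zf.namelist()):
            file_sha = hashlib.sha256(zf.read(name)).hexdigest()
            lines.append(f"{file_sha}  {name}\n")
    digest = hashlib.sha256("".join(lines).encode()).digest()
    return "h1:" + base64.b64encode(digest).decode()

def make_zip(date_time) -> bytes:
    # Two zips with identical contents but different timestamps:
    # different zip bytes (so different sha256), same Hash1.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        info = zipfile.ZipInfo("mod/go.mod", date_time=date_time)
        zf.writestr(info, b"module example.com/mod\n")
    return buf.getvalue()
```

This is why http_archive's sha256 argument cannot be filled in from go.sum: the two hashes are computed over different inputs.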

@dvtkrlbs
Contributor Author

Another way could be to skip the GOPROXY and use the default VCS resolution algorithm that Go uses. This is much more complicated than using the user-supplied or default GOPROXY, but it would probably avoid the unnecessary downloads.

@thoughtpolice
Contributor

thoughtpolice commented Feb 29, 2024

FWIW, I'm not a Go programmer, but by default I don't think there's a way around this. You have to download the files and calculate the proper hash manually as part of the generation step. Many other tools work this way in some form or another, not just in Buck2-land but elsewhere.

But to work around this today, instead of using ctx.actions.download_file you could try ctx.actions.run("curl -L -o ...") to download the file through a command, then untar it, then check the file hash. I think this won't work with Remote Execution (your RBE workers probably have their network cut off), but it's close to what you want.
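A minimal Buck2 Starlark sketch of that workaround (untested; the `url` attribute name and output filename are assumptions, and as noted it depends on local execution having network access):

```python
# Hedged sketch: fetch via curl in a local-only action instead of
# ctx.actions.download_file, sidestepping the mandatory sha256 argument.
def _fetch_impl(ctx):
    out = ctx.actions.declare_output("archive.tar.gz")
    ctx.actions.run(
        cmd_args("curl", "-sfL", "-o", out.as_output(), ctx.attrs.url),
        category = "fetch",
        local_only = True,  # RBE workers typically have no network
    )
    return [DefaultInfo(default_output = out)]
```

A hash-verification step (e.g. a second action running sha256sum against the downloaded file) could then be chained on, if you have a hash to check against.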


I think that to support this Buck would need a feature that the Nix package manager calls "fixed output derivations" or "FODs". These are build rules where you don't give the hash of the downloaded file itself, but instead give a hash of the expected output files that some arbitrary commands produce.

For example, if you download https://example.com/foo-0.1.tar.gz, you can give the hash of that file directly. But this hash is fragile: the server admin may recompress the file with different compression settings, while the contents of the tarball remain the same in any case. Therefore, you want to download the file, then run a command to extract it, then check the hash of the output files. This feature also means you can e.g. completely change the download URL to something arbitrary and it will still work and get cache hits/early cut-off, assuming that the contents are in fact the same.
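The recompression hazard can be demonstrated with a toy content hash. This is a sketch of the property an FOD checks (hash of the extracted tree, not of the archive bytes), not Nix's actual NAR hashing:

```python
import hashlib
import io
import tarfile

def tree_hash(tar_bytes: bytes) -> str:
    """Hash the *extracted contents* of an archive (names + file data),
    ignoring archive-level metadata like timestamps or compression."""
    h = hashlib.sha256()
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tf:
        for member in sorted(tf.getmembers(), key=lambda m: m.name):
            h.update(member.name.encode())
            if member.isfile():
                h.update(tf.extractfile(member).read())
    return h.hexdigest()

def make_tar(mtime: int) -> bytes:
    # Same file contents, different archive metadata: the archive
    # bytes change but the extracted tree does not.
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w") as tf:
        data = b"print('hello')\n"
        info = tarfile.TarInfo("foo-0.1/main.py")
        info.size = len(data)
        info.mtime = mtime
        tf.addfile(info, io.BytesIO(data))
    return raw.getvalue()
```

Two archives that differ byte-for-byte still produce the same `tree_hash`, which is exactly the stability a fixed-output check gives you across mirrors and recompressed uploads.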

I don't know what FODs would look like in Buck2 at all.

@JakobDegen
Contributor

@thoughtpolice you could imagine doing this in a dynamic_output-like way. Imagine essentially the same API that dynamic_output has today, except that on the call to dynamic_output you also have to specify the hashes of all of the output artifacts that you're later going to bind inside it. In exchange, the analysis context in the dynamic function could give you access to a version of ctx.actions.download_file that doesn't require specifying the hash. That's really only part of what I understand FODs to provide; you'd probably also want such a construct to change caching behavior in other, less obvious ways to really see the benefits.
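A purely hypothetical sketch of that shape; the `output_hashes` parameter and the hash-free `download_file` variant do not exist in Buck2 today and are invented here only to illustrate the idea:

```python
# Hypothetical Starlark -- not a real Buck2 API.
def _impl(ctx):
    out = ctx.actions.declare_output("src.tar.gz")

    def body(dctx, artifacts, outputs):
        # The hash was promised up front, so (hypothetically) no
        # sha256 argument is needed at download time.
        dctx.actions.download_file(outputs[out], ctx.attrs.url)

    ctx.actions.dynamic_output(
        dynamic = [],
        inputs = [],
        outputs = [out.as_output()],
        f = body,
        # Invented argument: bind the expected content hash up front.
        output_hashes = {out: ctx.attrs.sha256},
    )
```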

I agree with Austin though, your best option right now is to have a locally run action that invokes curl.
