Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Idea] Support Starlark overlays without patches #1566

Open
aaronmondal opened this issue Mar 1, 2024 · 11 comments · May be fixed by #2046
Open

[Idea] Support Starlark overlays without patches #1566

aaronmondal opened this issue Mar 1, 2024 · 11 comments · May be fixed by #2046

Comments

@aaronmondal
Copy link

At the moment all modules are overlaid with patches and those patches are hashed and tracked in source.json.

Many modules probably just need a MODULE.bazel file and some buildfiles. I think it would be nice UX improvement if those files could be put into the BCR directly and have some mechanism to "automatically" overlay these Bazel-specific files.

Since we already have MODULE.bazel in the BCR without patches, it might be viable to have some sort of overlay mechanism, maybe even an overlays directory that adds Starlark files directly on top of the external repository.

Such an approach would be somewhat similar to how it feels like to work in nixpkgs: Have the build overlays as regular source code and only use patches for things that are "true" functionality-changing patches.

I believe this could make it nicer to browse the BCR and get inspiration. It would also make it easier to spot functional changes and would potentially reduce the friction of having to generate patch files all the time.

@alexeagle
Copy link
Contributor

+1 it's really painful to have to encode new files as patches

@fmeum
Copy link
Contributor

fmeum commented Apr 25, 2024

We could add a remote_files attribute to http_archive that takes a dict from repo-relative paths to URLs and downloads the files from these URLs to the given path (for simplicity, assuming that the original file didn't exist). We would need an additional remote_files_integrity dict to store the hashes.

Then registries could start populating this attribute in their source.json files, but they would only be compatible with Bazel versions with support for the new attributes. I need to check what an old Bazel version would do with such a source.json - hopefully it would fail rather than silently dropping the attributes. But we could always have a bazel_compatibility requirement for such modules.

@meteorcloudy @Wyverald @SalmaSamy What do you think?

@meteorcloudy
Copy link
Member

SGTM, PR welcome. And we could backport the change to 6.x and 7.x.

@fzakaria
Copy link

I am interested in working on this.
I think it would be a big QoL improvement.

@fmeum if could point me to some source files or maybe a PR that did something similar in changes to sources.json to mimic I can try that.

@fzakaria
Copy link

I want to add not only is it annoying to submit patch files but the BUILD files aren't searchable on GitHub search using the starlark language filter. That would help expose more samples.

@fzakaria
Copy link

Okay here is my research:

  1. Augment https://github.com/bazelbuild/bazel/blob/master/tools/build_defs/repo/http.bzl#L235 to add a new attribute for an overlay directory (auto) or the dict files (manual)
  2. Augment https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/bazel/bzlmod/ArchiveRepoSpecBuilder.java#L28
  3. Augment https://github.com/bazelbuild/bazel/blob/618c0abbfe518f4e29de523a2e63ca9179050e94/src/main/java/com/google/devtools/build/lib/bazel/bzlmod/IndexRegistry.java#L189 to handle support for the new fields in the source.json

Looks like (1) can be starlark only.
(2) and (3) need to be changed for bzlmod but I'm not 100% certain of their innerworkings.

@fmeum
Copy link
Contributor

fmeum commented Apr 26, 2024

@fzakaria That's highly appreciated! Your research looks solid, most of this can indeed be done in Starlark and I can also help with the Java parts.

For starters, I would suggest to focus on the Starlark part and augment http_archive with two new attributes:

  • remote_file_urls of type string_list_dict, mapping a relative path of a file to its source URLs, thus supporting mirrors. Remote patches currently don't support mirrors. @meteorcloudy Was that intentional?
  • remote_file_integrity of type string_dict, mapping the same relative path to the integrity value.

I think that attribute structure is going to be simpler than a full-blown overlay JSON, with a very similar experience: In the BCR, we could just have a files sibling directory of patches with files neatly laid out according to their relative path. The usual tooling (//tools:update_integrity in the BCR) could then add the integrity values to source.json in whatever format we want.

When implementing the download logic for the remote files, I would suggest to keep the following in mind:

  • Downloads of multiple files can be parallelized via the block parameter of download. This can't be cherry-picked into Bazel 6 though.
  • Before downloading the files, we should make sure that their relative paths can't result in path traversal outside the repository directory. We should forbid .. segments and also verify that the absolute path obtained by resolving the relative one against the repository directory still lies under the repository directory. As far as I know download_and_extract can't create symlinks to absolute paths that are directories, but we should also doublecheck that this doesn't allow malicious archives to place files anywhere on the system.

fzakaria added a commit to fzakaria/bazel that referenced this issue Apr 26, 2024
This issue adds support necessary to tackle
bazelbuild/bazel-central-registry#1566

Add two new attributes to http_archive: remote_file_urls and
remote_file_integrity.

The purpose of these two attributes is to allow files to effectively be
overlaid ontop of an http_archive. The goal of such functionality would
be useful for BCR since the BUILD & WORKSPACE files need no longer be
stored as patch files.

This means we could probably deprecate `build_file` since that could be
referenced as a file:// url in the remote_file_urls attribute.

Co-authored-by: Mark Williams <mark.r.williams@google.com>
@fzakaria
Copy link

@fmeum thank you for the advice -- I was able to start bazelbuild/bazel#22155
It is working. I finished 90% -- I can add remaining tests and finish it but I wanted some early feedback.

fzakaria added a commit to fzakaria/bazel that referenced this issue Apr 26, 2024
This issue adds support necessary to tackle
bazelbuild/bazel-central-registry#1566

Add two new attributes to http_archive: remote_file_urls and
remote_file_integrity.

The purpose of these two attributes is to allow files to effectively be
overlaid ontop of an http_archive. The goal of such functionality would
be useful for BCR since the BUILD & WORKSPACE files need no longer be
stored as patch files.

This means we could probably deprecate `build_file` since that could be
referenced as a file:// url in the remote_file_urls attribute.

Co-authored-by: Mark Williams <markrwilliams@google.com>
fzakaria added a commit to fzakaria/bazel that referenced this issue Apr 30, 2024
This issue adds support necessary to tackle
bazelbuild/bazel-central-registry#1566

Add two new attributes to http_archive: remote_file_urls and
remote_file_integrity.

The purpose of these two attributes is to allow files to effectively be
overlaid ontop of an http_archive. The goal of such functionality would
be useful for BCR since the BUILD & WORKSPACE files need no longer be
stored as patch files.

This means we could probably deprecate `build_file` since that could be
referenced as a file:// url in the remote_file_urls attribute.

Co-authored-by: Mark Williams <markrwilliams@google.com>
copybara-service bot pushed a commit to bazelbuild/bazel that referenced this issue May 10, 2024
This issue adds support necessary to tackle
bazelbuild/bazel-central-registry#1566

Add two new attributes to http_archive: remote_file_urls and remote_file_integrity.

The purpose of these two attributes is to allow files to effectively be overlaid ontop of an http_archive. The goal of such functionality would be useful for BCR since the BUILD & WORKSPACE files need no longer be stored as patch files.

This means we could probably deprecate `build_file` since that could be referenced as a file:// url in the remote_file_urls attribute.

Co-authored-by: Mark Williams <markrwilliams@google.com>

CC @fmeum

Closes #22155.

PiperOrigin-RevId: 632594203
Change-Id: I6310093482c5c58537ed6dbe4ff90bafdbd696ff
bazel-io pushed a commit to bazel-io/bazel that referenced this issue May 10, 2024
This issue adds support necessary to tackle
bazelbuild/bazel-central-registry#1566

Add two new attributes to http_archive: remote_file_urls and remote_file_integrity.

The purpose of these two attributes is to allow files to effectively be overlaid ontop of an http_archive. The goal of such functionality would be useful for BCR since the BUILD & WORKSPACE files need no longer be stored as patch files.

This means we could probably deprecate `build_file` since that could be referenced as a file:// url in the remote_file_urls attribute.

Co-authored-by: Mark Williams <markrwilliams@google.com>

CC @fmeum

Closes bazelbuild#22155.

PiperOrigin-RevId: 632594203
Change-Id: I6310093482c5c58537ed6dbe4ff90bafdbd696ff
github-merge-queue bot pushed a commit to bazelbuild/bazel that referenced this issue May 10, 2024
This issue adds support necessary to tackle
bazelbuild/bazel-central-registry#1566

Add two new attributes to http_archive: remote_file_urls and
remote_file_integrity.

The purpose of these two attributes is to allow files to effectively be
overlaid ontop of an http_archive. The goal of such functionality would
be useful for BCR since the BUILD & WORKSPACE files need no longer be
stored as patch files.

This means we could probably deprecate `build_file` since that could be
referenced as a file:// url in the remote_file_urls attribute.

Co-authored-by: Mark Williams <markrwilliams@google.com>

CC @fmeum

Closes #22155.

PiperOrigin-RevId: 632594203
Change-Id: I6310093482c5c58537ed6dbe4ff90bafdbd696ff

Commit
e01509f

Co-authored-by: Farid Zakaria <fmzakari@google.com>
@fzakaria
Copy link

fzakaria commented May 11, 2024

@fmeum sent me next steps I am capturing here:

You will have to extend this file: https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/bazel/bzlmod/IndexRegistry.java;l=131;drc=a87a8db23885535232ace3f64ac45e595c95d2f5

In particular the getRepoSpec method. You can mostly mimic the logic for patches, I think.

Once that lands, we can update the tooling in the bazel-central-registry repo.

Kila2 pushed a commit to Kila2/bazel that referenced this issue May 13, 2024
This issue adds support necessary to tackle
bazelbuild/bazel-central-registry#1566

Add two new attributes to http_archive: remote_file_urls and remote_file_integrity.

The purpose of these two attributes is to allow files to effectively be overlaid ontop of an http_archive. The goal of such functionality would be useful for BCR since the BUILD & WORKSPACE files need no longer be stored as patch files.

This means we could probably deprecate `build_file` since that could be referenced as a file:// url in the remote_file_urls attribute.

Co-authored-by: Mark Williams <markrwilliams@google.com>

CC @fmeum

Closes bazelbuild#22155.

PiperOrigin-RevId: 632594203
Change-Id: I6310093482c5c58537ed6dbe4ff90bafdbd696ff
fzakaria added a commit to fzakaria/bazel that referenced this issue May 13, 2024
This is a continuation of bazelbuild#22155 that adds the newly added
'remote_files' attribute for http_archive to the bzlmod functionality.

The end goal is to then update BCR to this new functionality to overlay
files rather than use patch files when providing MODULE/WORKSPACE/BUILD
files.

bazelbuild/bazel-central-registry#1566 has a
good discussion of the rationale.
@fmeum
Copy link
Contributor

fmeum commented May 13, 2024

While @fzakaria is close to getting overlay support into Bazel, I noticed something that could make our lives easier today: It's possible to reuse patches from other module versions, which could be used to simplify reviews of C++ modules. See https://github.com/bazelbuild/bazel-central-registry/pull/2024/files#diff-90fa618704d4146385f96570f2033f70b09f59ae5c630825ea0dde357520ebd5R5 for an example.

@fzakaria fzakaria linked a pull request May 16, 2024 that will close this issue
@fzakaria
Copy link

fzakaria commented May 16, 2024

Final piece is up for PR #2046 2046

copybara-service bot pushed a commit to bazelbuild/bazel that referenced this issue May 23, 2024
This is a continuation of #22155 that adds the newly added 'remote_files' attribute for http_archive to the bzlmod functionality.

The end goal is to then update BCR to this new functionality to overlay files rather than use patch files when providing MODULE/WORKSPACE/BUILD files.

bazelbuild/bazel-central-registry#1566 has a good discussion of the rationale.

Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>

Closes #22349.

PiperOrigin-RevId: 636682112
Change-Id: Ief070985598a7c0f427a98cd3daeb69a0984f7be
bazel-io pushed a commit to bazel-io/bazel that referenced this issue May 23, 2024
This is a continuation of bazelbuild#22155 that adds the newly added 'remote_files' attribute for http_archive to the bzlmod functionality.

The end goal is to then update BCR to this new functionality to overlay files rather than use patch files when providing MODULE/WORKSPACE/BUILD files.

bazelbuild/bazel-central-registry#1566 has a good discussion of the rationale.

Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>

Closes bazelbuild#22349.

PiperOrigin-RevId: 636682112
Change-Id: Ief070985598a7c0f427a98cd3daeb69a0984f7be
github-merge-queue bot pushed a commit to bazelbuild/bazel that referenced this issue May 23, 2024
This is a continuation of #22155 that adds the newly added
'remote_files' attribute for http_archive to the bzlmod functionality.

The end goal is to then update BCR to this new functionality to overlay
files rather than use patch files when providing MODULE/WORKSPACE/BUILD
files.

bazelbuild/bazel-central-registry#1566 has a
good discussion of the rationale.

Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>

Closes #22349.

PiperOrigin-RevId: 636682112
Change-Id: Ief070985598a7c0f427a98cd3daeb69a0984f7be

Commit
c4167e3

Co-authored-by: Farid Zakaria <fmzakari@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants