GitHub Action Bazel caching does not seem to work rules_proto_grpc #161

Open
prm-dan opened this issue Nov 25, 2021 · 4 comments
Labels
blocked-external (Waiting on external issue/release), bug (Something isn't working)

Comments

prm-dan commented Nov 25, 2021

Description

Hi. When I use rules_proto_grpc with a GitHub Actions cache, the protobuf library appears to be rebuilt on every run (and that rebuild is most of the build time). I'm not sure why; my guess is that the protobuf compiler's output lands in a directory that isn't covered by the cache. I tried looking through rules_proto_grpc to see whether the code does anything special, but the code base is large. One odd data point: I did not hit this issue when building protos directly (without rules_proto_grpc).

Q - which directory does rules_proto_grpc put the compiled files in? I could try to work around this by caching that directory explicitly.
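I'm not sure of the rules_proto_grpc-specific layout, but these standard Bazel commands should show where the outputs and the fetched external repos (including com_google_protobuf) actually live, to check whether the cached paths cover them:

bazel info output_base        # per-workspace output tree; under ~/.cache/bazel/_bazel_$USER/ on Linux by default
bazel info repository_cache   # content-addressed cache of downloaded external archives
ls "$(bazel info output_base)/external"   # fetched/built external repos, e.g. com_google_protobuf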

I'm using rules_proto_grpc-3.1.0.

Analyzing: 68 targets (127 packages loaded, 8780 targets configured)
Analyzing: 68 targets (127 packages loaded, 8780 targets configured)
INFO: Analyzed 68 targets (182 packages loaded, 9284 targets configured).
INFO: Found 68 targets...
[0 / 3] [Prepa] BazelWorkspaceStatusAction stable-status.txt
[26 / 220] Compiling src/google/protobuf/wire_format_lite.cc; 0s linux-sandbox ... (2 actions, 1 running)
[43 / 220] Compiling src/google/protobuf/generated_message_table_driven_lite.cc; 0s linux-sandbox ... (2 actions, 1 running)
[59 / 220] Compiling src/google/protobuf/compiler/csharp/csharp_reflection_class.cc; 0s linux-sandbox ... (2 actions running)
[72 / 220] Compiling src/google/protobuf/compiler/cpp/cpp_field.cc; 1s linux-sandbox ... (2 actions running)
[93 / 220] Compiling src/google/protobuf/compiler/objectivec/objectivec_enum_field.cc; 0s linux-sandbox ... (2 actions running)
[112 / 220] Compiling src/google/protobuf/descriptor.cc; 1s linux-sandbox ... (2 actions running)
[135 / 220] Compiling src/google/protobuf/compiler/java/java_message_builder_lite.cc; 1s linux-sandbox ... (2 actions running)
[207 / 291] GoCompilePkg external/org_golang_google_protobuf/reflect/protoreflect/protoreflect.a; 0s linux-sandbox ... (2 actions running)
[351 / 502] Compiling src/google/protobuf/compiler/cpp/cpp_file.cc [for host]; 2s linux-sandbox ... (2 actions running)
[399 / 502] Compiling src/google/protobuf/extension_set_heavy.cc [for host]; 0s linux-sandbox ... (2 actions running)
[442 / 502] Compiling src/google/protobuf/descriptor.pb.cc [for host]; 3s linux-sandbox ... (2 actions running)
[526 / 570] Building external/com_google_protobuf/libstruct_proto-speed.jar (1 source jar); 2s multiplex-worker ... (2 actions running)
INFO: From ProtoCompile proto/event/event_pb2.py:

Here's the cache step of my GitHub Action.

      - name: Cache Bazel
        uses: actions/cache@v2.1.6
        env:
          cache-name: bazel-cache
        with:
          path: |
            ~/.cache/bazelisk
            ~/.cache/bazel
          # To write a new cache, we need to have a unique key.
          # TODO - try to get "-${{ hashFiles('WORKSPACE', '**/BUILD.bazel') }}"" to work
          key: ${{ runner.os }}-${{ env.cache-name }}-${{ github.ref }}
          # We use restore-keys to find a recent cache to reuse.
          restore-keys: 
            ${{ runner.os }}-${{ env.cache-name }}-
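For reference, the hashFiles-based key from the TODO above would look roughly like this (just a sketch of that TODO, not something verified in this workflow):

          # Sketch of the hashFiles-based key (untested here):
          key: ${{ runner.os }}-${{ env.cache-name }}-${{ hashFiles('WORKSPACE', '**/BUILD.bazel') }}
          restore-keys: |
            ${{ runner.os }}-${{ env.cache-name }}-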
aaliddell (Member) commented Nov 25, 2021

I see this sometimes too and it's really difficult to track down. Essentially, Bazel thinks something has changed that requires rebuilding protobuf, and there are many possible causes. First off, try running with --incompatible_strict_action_env. This flag restricts how changes in environment variables can invalidate your cache, but it likely won't solve the problem on its own.
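For reference, that flag can be set on the command line or pinned in .bazelrc (the //... target pattern below is just a placeholder):

# One-off:
bazel build //... --incompatible_strict_action_env
# Or pinned for every build, in .bazelrc:
build --incompatible_strict_action_env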

The de-facto way of tracing this sort of thing is using --execution_log_json_file and finding the exact action that causes the action graph to differ. However, the tooling around this log was pretty lacking the last time I tried, so it's not a simple process to separate cause from effect. See this doc, which explains the process (even though you aren't using remote execution, the process is the same): https://docs.bazel.build/versions/main/remote-execution-caching-debug.html. Using bazel aquery may also be a better alternative, if it exposes the necessary info.
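A rough sketch of that workflow, in case it helps (target pattern and file names are placeholders; per the linked doc, a plain diff of the JSON logs is noisy, and the execution-log parser it describes gives a cleaner, sorted comparison):

# Capture an execution log from the first build:
bazel build //... --execution_log_json_file=/tmp/exec_log_1.json
# ... repeat on the "cold cache" run (e.g. the CI runner) ...
bazel build //... --execution_log_json_file=/tmp/exec_log_2.json
# Compare to find the first action whose inputs/env differ:
diff /tmp/exec_log_1.json /tmp/exec_log_2.json | head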

Realistically, if it's protobuf getting spuriously rebuilt, it's probably an issue with the protobuf repo rather than here. Having a look through their issues, this one may be related: protocolbuffers/protobuf#6886. Basically, by using that use_default_shell_env option they are punching a hole in the Bazel sandbox and allowing any environment variable change to trash the cache. Since GitHub Actions (and other CI systems) intentionally set different env vars on every run, this may be why you see it so badly there.

Therefore, an interesting experiment to run would be to manually strip your env vars to a bare minimum before calling Bazel to build. Something like (pinched from here):

env -i HOME="$HOME" LC_CTYPE="${LC_ALL:-${LC_CTYPE:-$LANG}}" PATH="$PATH" USER="$USER" bazel ...
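In a GitHub Actions workflow that could be a step along these lines (just a sketch, with a placeholder target pattern):

      - name: Build with a stripped environment
        run: |
          env -i HOME="$HOME" LC_CTYPE="${LC_ALL:-${LC_CTYPE:-$LANG}}" PATH="$PATH" USER="$USER" bazel build //...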

aaliddell added the bug (Something isn't working) label on Nov 25, 2021
aaliddell added the blocked-external (Waiting on external issue/release) label on Aug 24, 2022
alexeagle (Contributor) commented

I've sent bazelbuild/rules_proto#206 to finally fix this upstream, so we don't need to compile protoc at all.

aaliddell (Member) commented

Excellent. Is there anything that needs changing here to support this, or is it a drop-in replacement?

I ask because we have a toolchain for protoc here, for reasons that I do not remember. Presumably this can be ditched and replaced with the official toolchain now in rules_proto?

alexeagle (Contributor) commented

Under bzlmod, the toolchain registration will be automatic so there's nothing to do. WORKSPACE users might have to add a line.
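Roughly, for WORKSPACE users that would mean something like the usual rules_proto setup macros (illustrative only; the exact load path and macro names depend on the rules_proto release that picks up that PR):

# Illustrative WORKSPACE sketch, not tied to a specific rules_proto release:
load("@rules_proto//proto:repositories.bzl", "rules_proto_dependencies", "rules_proto_toolchains")

rules_proto_dependencies()

rules_proto_toolchains()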

I still have to argue with Google about why this is the right way to do it. They seem to want to go the other direction and make the protobuf repo even more load-bearing.
