Local and Registry cache not used or *invalidating* cache unnecessarily #4910

Closed
LucasLundJensen opened this issue May 10, 2024 · 3 comments

Comments

@LucasLundJensen

LucasLundJensen commented May 10, 2024

We've been having some issues when exporting and importing cache, both from a registry and locally; we use the buildctl-daemonless example.
Our use case is spawning N pods in our Kubernetes cluster, each building various applications, and we obviously want caching enabled to make this as fast as possible.

The problem we've run into when storing the cache locally on a volume mounted to the pod is that the cache is very rarely actually used, even though the source code is exactly the same and the Dockerfile is very simple, causing the builds to take much longer than necessary.

Command:

    $BUILDCTL --addr=$(cat $tmp/addr) build --frontend dockerfile.v0 --local context=. --local dockerfile=. \
        --output type=image,name="${IMAGE_REF}",push=true \
        --export-cache type=local,dest=$CACHE_PATH,mode=max \
        --import-cache type=local,src=$CACHE_PATH,mode=max

Dockerfile:

FROM node:18-alpine as build

WORKDIR /app

COPY package.json yarn.lock .npmrc /app
RUN yarn install

COPY . /app
RUN yarn build

FROM scratch

WORKDIR /app
COPY --from=build /app/dist /app/dist

We tried switching over to a registry instead, as we got the impression from the README that it is more intended for the case where multiple instances of buildctl use the same cache.
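
For reference, the registry variant of the command looked roughly like this (CACHE_REGISTRY_REF is just a placeholder for the cache image reference, e.g. a :buildcache tag on our registry):

    $BUILDCTL --addr=$(cat $tmp/addr) build --frontend dockerfile.v0 --local context=. --local dockerfile=. \
        --output type=image,name="${IMAGE_REF}",push=true \
        --export-cache type=registry,ref="${CACHE_REGISTRY_REF}",mode=max \
        --import-cache type=registry,ref="${CACHE_REGISTRY_REF}"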

This also seemed to do a lot better; the new pods consistently use the cache now. But upon further testing, we noticed that if the source code were to change (like a simple test.txt file being added, or an index.js file being manipulated), our RUN yarn install would no longer be cached.

This didn't seem correct, so we tested locally with plain Docker, made the exact same code change we had tested in the pod with buildctl, and Docker did use the cache for the RUN yarn install as expected.

We are essentially running the master-rootless version of BuildKit, but I'll include the full Dockerfile that we use for our pods.

Dockerfile

ARG ORAS_VERSION=v0.16.0

FROM ghcr.io/oras-project/oras:${ORAS_VERSION} as oras 

FROM moby/buildkit:master-rootless as final

USER 1000:1000

COPY --from=oras --chown=user /bin/oras /usr/bin/oras
COPY --chown=user buildkit.sh / 

WORKDIR /data

ENTRYPOINT [ "/buildkit.sh" ]

The ENTRYPOINT can be ignored for now; we shell into the spawned pod and run it manually. It is just the daemonless script.

For our specific use case, using a local cache on a mounted volume would be best, as it gives us the fastest cache possible, but if registries are the intended resolution for this, that's fine too.

@tonistiigi
Member

we noticed that if the source code were to change (like a simple test.txt file being added, or an index.js file being manipulated), our RUN yarn install would no longer be cached.

This is expected with your Dockerfile above, which uses COPY . on the whole context. That instruction will copy different files, resulting in a different end build result. Different files would also be visible to the yarn install command, so it would need to run again.

Cache mismatches are common if you use COPY .; for example, git clone does not create a .git directory in a deterministic way. If you reduce the files used by your build to only the files you actually need, you reduce the chance of cache invalidation.
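
For example, a .dockerignore along these lines (entries here are just illustrative) keeps files the build does not need out of what COPY . sees:

    # .dockerignore: exclude files the build does not need
    .git
    node_modules
    dist
    *.md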

If you have a reproducible case/steps where you think the cache is not matched properly, you can post that and we can look at the specifics.

@LucasLundJensen
Author

This is expected with your Dockerfile above, which uses COPY . on the whole context. That instruction will copy different files, resulting in a different end build result. Different files would also be visible to the yarn install command, so it would need to run again.

I don't see how the COPY . that happens after the RUN yarn install step invalidates the previous step?
yarn install only interacts with the package.json and yarn.lock files, and running this Dockerfile locally with Docker does not produce the same cache invalidation issues as it does with BuildKit.

I tested by pruning all my Docker cache, running the build once, adding a test.txt file, and running the build again; as shown in the output below, the yarn install step is still cached.
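
Roughly, the steps were (the image tag is just a placeholder), with the output of the second build below:

    docker builder prune -af       # clear all existing build cache
    docker build -t cache-test .   # first build, populates the cache
    touch test.txt                 # unrelated source change
    docker build -t cache-test .   # second build, RUN yarn install stays CACHED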

=> [internal] load build definition from Dockerfile                                                                                                    0.0s
 => => transferring dockerfile: 261B                                                                                                                    0.0s
 => [internal] load metadata for docker.io/library/node:18-alpine                                                                                       0.5s
 => [internal] load .dockerignore                                                                                                                       0.0s
 => => transferring context: 2B                                                                                                                         0.0s
 => [build 1/6] FROM docker.io/library/node:18-alpine@sha256:4837c2ac8998cf172f5892fb45f229c328e4824c43c8506f8ba9c7996d702430                           0.0s
 => [stage-1 1/2] WORKDIR /app                                                                                                                          0.0s
 => [internal] load build context                                                                                                                       0.0s
 => => transferring context: 446B                                                                                                                       0.0s
 => CACHED [build 2/6] WORKDIR /app                                                                                                                     0.0s
 => CACHED [build 3/6] COPY package.json yarn.lock .npmrc /app                                                                                          0.0s
 => CACHED [build 4/6] RUN yarn install --network-timeout 100000                                                                                        0.0s
 => [build 5/6] COPY . /app                                                                                                                             0.0s
 => [build 6/6] RUN yarn build                                                                                                                          1.2s
 => CACHED [stage-1 2/2] COPY --from=build /app/dist /app/dist                                                                                          0.0s
 => exporting to image                                                                                                                                  0.0s
 => => exporting layers                                                                                                                                 0.0s
 => => writing image sha256:4b77c51068465e05ea438134e137de21bbb3c69f287d78e9ffdd81c767df62cf

I'll create a reproducible case early next week; just wanted to get initial thoughts out quickly 😊

@LucasLundJensen
Author

I haven't had the time to set up a reproducible case, as the setup has quite a lot of configuration, so it would take a while to produce an exact replica.

We did come up with something that seems to work for the local cache: each image being built now stores its cache in a separate subdirectory, meaning that no two builds ever write to the same cache directory unless they are building the same image.

This seems to work, and we have had much better success with the cache not being broken.

Here is the code snippet added to buildkit.sh for anyone else interested:

cache_subpath="${UNIQUE_IMAGE_ID}"
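# one cache subdirectory per image, so builds of different images never write to the same cache directory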
buildkit_cache="${CACHE_PATH}/buildkit/${cache_subpath}"
mkdir -p $buildkit_cache
cache="--export-cache type=local,dest=$buildkit_cache,mode=max --import-cache type=local,src=$buildkit_cache"
$BUILDCTL --addr=$(cat $tmp/addr) build ${BUILD_ARGS} --frontend dockerfile.v0 --local context=. --local dockerfile=. --opt filename=./${DOCKERFILE} \
        --output type=local,dest=output $cache
