Zarf Cache Error #2491

Closed
TristanHoladay opened this issue May 10, 2024 · 3 comments

Comments

@TristanHoladay
Contributor

Environment

Device and OS: uds-ubuntu-big-boy-8-core github runner
App version: >=v0.33.0
Kubernetes distro being used: no distro involved at the point of failure

Steps to reproduce

  1. kick off CI for the test-cache PR

OR

  1. clone uds core
  2. switch to branch test-cache
  3. ensure zarf version in the setup action (.github/actions/setup) is v0.33.0 or later
  4. ensure the uds task test-uds-core in tasks/test.yaml has create:slim-dev-package after create:standard-package
  5. make a trivial change and push
  6. watch CI, specifically the Schedule (all, upstream, install) / Test job

Expected result

The all/upstream/install test should pass. More specifically, creating the slim-dev-package right after creating the standard package should not fail with zarf cache corruption errors.

Actual Result

Tests of the upstream flavor of uds core, in which the slim-dev-package is created after the standard package, fail with zarf cache corruption errors.

Visual Proof (screenshots, videos, text, etc)

Screenshot from 2024-05-09 12-15-36

Severity/Priority

low
(A workaround is available: run zarf tools clear-cache, or simply remove the create:slim-dev-package task from those tests, since it is not actually needed.)
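
The clear-cache workaround could be scripted for CI roughly as follows. This is a minimal Python sketch, not the project's actual task definition: the helper name create_with_cache_retry, the single-retry policy, and the exact zarf package create <dir> --confirm invocation are assumptions; only zarf tools clear-cache comes from this report.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the process exit code.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(argv)).returncode


def create_with_cache_retry(package_dir: str, run: Runner = run_command) -> bool:
    """Create a package; on failure, clear the zarf cache and retry once.

    Hypothetical helper. The create invocation is an assumption about how
    the CLI is called; `zarf tools clear-cache` is the workaround named in
    this report.
    """
    create_cmd = ["zarf", "package", "create", package_dir, "--confirm"]
    if run(create_cmd) == 0:
        return True
    # First attempt failed: drop the (possibly corrupted) cache, then retry.
    if run(["zarf", "tools", "clear-cache"]) != 0:
        return False
    return run(create_cmd) == 0
```

Because the runner is injectable, the retry logic can be exercised without zarf installed. Note that this sketch retries after any failure, not only after cache errors.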

Additional Context

Experimenting on the PR has yielded these results:

  • Does not fail locally
  • The only flavor failing is upstream
  • Creating the slim-dev-package (packages/slim-dev/), whose images are a subset of the standard package images (packages/standard/), first and then creating the standard package doesn't result in a cache error
  • The images that come up consistently in the cache error are docker istio images for the istio packages and identity-config for the keycloak package
  • Removing those images from the build makes CI pass only inconsistently
  • Regardless of which image it is failing on, the cache error always states "but only wrote 23"

Core uses the zarf vendored in uds-cli to create and deploy packages. To test versions of zarf that aren't included in a recent uds-cli version, the setup-zarf action was added to the setup action, and uds was removed from the create command for both the slim-dev and standard package tasks in tasks/create.yaml.
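
Since the failures share a recognizable phrase, CI logs can be scanned for it when triaging runs. The following is a minimal Python sketch under stated assumptions: the function name find_short_writes is hypothetical, and the only part of the error message it relies on is the phrase "but only wrote <number>" quoted above; the rest of the message format is not assumed.

```python
import re
from typing import List

# Assumption: the failing log line contains "but only wrote <N>", as quoted
# in this report. Nothing else about the message layout is relied upon.
SHORT_WRITE = re.compile(r"but only wrote (\d+)")


def find_short_writes(log_text: str) -> List[int]:
    """Return the number from every 'but only wrote N' occurrence in the log."""
    return [int(match.group(1)) for match in SHORT_WRITE.finditer(log_text)]
```

An empty result means the phrase did not appear, which helps separate this cache error from unrelated job failures.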

The test-cache PR commit history shows different configurations tried.

@TristanHoladay
Contributor Author

Update: after removing create:slim-dev-package from the test-uds-core task, we no longer had an issue deploying the bundle. However, the validate task creates and deploys a test app (/src/test/), and we're now seeing the cache error when that package is created (https://github.com/defenseunicorns/uds-core/actions/runs/9034824568/job/24829078798).

@mjnagel
Contributor

mjnagel commented May 10, 2024

This is also happening on some other repositories, e.g. https://github.com/defenseunicorns/uds-package-gitlab/actions/runs/8979399292/job/24661697804

It seems relatively consistent that this happens on a second package creation (first one completes, makes a valid package, second one fails during creation due to cache error). I don't think I could confidently say this is tied to a single registry - we've seen it with packages that exclusively pull from registry1.dso.mil, but also packages that pull from ghcr + dockerhub + quay + ...

@lucasrod16
Member

This issue was fixed by #2460
