
Making Community CI Work

Flynn edited this page Mar 13, 2023 · 1 revision

A Modest CI Proposal

for Preventing the Testing of new Submissions from being a burden on their Creators or Community, and for making them beneficial to the Publick.


Let's start with some points that I think have widespread support:

  1. Emissary wants to continue to use GitHub Actions for CI.
  2. Emissary's CI system should work for contributions from the community, not just from Emissary maintainers.
  3. Emissary's build system should be simpler to maintain and extend.
  4. Emissary's CI system should be simpler to maintain and understand, too.

In this proposal, I'll talk about CI. The build system will come later.

Specifically, I'm going to talk about point 2. We're already using GitHub Actions (point 1), and meaningfully simplifying CI is probably a thing that can wait for the build system (points 3 & 4).

Why Doesn't CI Work for Contributors?

There is actually only one reason that CI won't work for community contributions right now: the CI process requires pushing Docker artifacts to the Emissary Docker registry, and community contributions don't have access to the secret that would allow that.

The obvious solution here would be simply to push to an internal registry, rather than to a public registry. While enabling an internal registry for the build step is easy, GitHub Actions won't allow that registry to persist across jobs. The canonical solution here is that:

  • In your first job, you push to the registry, use docker save to create a tarfile of the artifacts you pushed, then save that tarfile as a GitHub Actions artifact.

  • In your second job, you pull in the GitHub Actions artifact, use docker load to load its contents into the local Docker cache, then push them to a new internal Docker registry.
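The two jobs above can be sketched roughly as follows. This is an illustration of the general save/load pattern, not Emissary's actual CI: the image name, registry address, and tarfile name are hypothetical placeholders, and the upload and download of the GitHub Actions artifact happen outside the script. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only. IMAGE and TARBALL are hypothetical placeholders.
IMAGE="${IMAGE:-localhost:5000/emissary:ci}"
TARBALL="${TARBALL:-images.tar}"

# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Job 1: push to the job-local registry, then export the image to a tarfile.
# The tarfile would then be uploaded as a GitHub Actions artifact.
job_one() {
  run docker push "$IMAGE"
  run docker save --output "$TARBALL" "$IMAGE"
}

# Job 2: after downloading the artifact, import the image into the local
# Docker cache and push it to the fresh job-local registry.
job_two() {
  run docker load --input "$TARBALL"
  run docker push "$IMAGE"
}
```

Calling job_one in the first job and job_two in the second, with DRY_RUN=0, would run the commands for real. Both the save and the load are extra steps on every CI run.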

Leaving aside the extra time required for docker save and docker load, implementing this in the Emissary build system turns out to be hard. The build system uses stamps to avoid calling docker build on artifacts that have already been built, and because of those stamps, just calling docker save and docker load won't work.

Since significant build-system work is deliberately out of scope for now, let's back up and ask a different question:

Why do we need to push the artifacts at all during CI?

The answer to this question is buried in a design goal from years ago: the maintainers all felt strongly that we should build once, test the bits produced by that build, then ship the same bits if all was well. (We later relaxed this to allow putting another Docker layer over the tested bits, so that we could update the version number presented to the user.)

Suppose we relax it further, and declare that it's OK to rebuild between test and release as long as we build from exactly the same Git commit. How would that affect everything?

The build system

The build system doesn't need to change: make images just leaves images in the local Docker cache, and make push will continue to push anywhere you tell it to. Done.

(This also implies that existing dev loops shouldn't need to change: however you build things to test with, that should still be OK.)

Releasing

Releasing needs a fairly minor change: instead of doing a special release build that just adds one more layer on top of an RC build to tweak the version number, it'll need to do a "normal" build with a GA version number. This is likely to be simpler than what we currently do.
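Since the release build would no longer reuse the tested bits, the release job could guard the "same Git commit" condition explicitly. A minimal sketch, assuming the SHA of the commit that passed CI is recorded somewhere the release job can read it. The same_commit helper and the TESTED_SHA variable in the usage note are hypothetical names, not part of the Emissary build system.

```shell
#!/bin/sh
# Succeeds only when the commit being released is the commit that was tested.
# $1: SHA that passed CI (how it is recorded is left open; hypothetical).
# $2: SHA currently checked out for the release build.
same_commit() {
  tested_sha="$1"
  current_sha="$2"
  if [ -z "$tested_sha" ] || [ "$tested_sha" != "$current_sha" ]; then
    echo "refusing to release: tested ${tested_sha:-<none>}, building $current_sha" >&2
    return 1
  fi
  echo "release build matches tested commit $current_sha"
}
```

A release job might call it as same_commit "$TESTED_SHA" "$(git rev-parse HEAD)" before starting the GA build.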

The CI system

This is where the work will be. It boils down to:

  • Build at the beginning of the test phase (which, as noted above, leaves the images in the local Docker cache).
  • Do not push the images anywhere.
  • Use k3d image load to copy the images from the local Docker cache into the k3d cluster that the tests run against.
  • Run the tests as usual.
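The steps above can be sketched as a single test-phase script. This is a sketch under stated assumptions, not the actual workflow: the image name, cluster name, and test target are hypothetical. It uses k3d image import, the documented command name; the load spelling used on this page is assumed to be an alias of it. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only. IMAGE, CLUSTER, and TEST_TARGET are hypothetical placeholders.
IMAGE="${IMAGE:-emissary:ci}"
CLUSTER="${CLUSTER:-ci}"
TEST_TARGET="${TEST_TARGET:-test}"

# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

ci_test_phase() {
  # Build; the images stay in the local Docker cache.
  run make images
  # Deliberately no docker push here, so no registry secret is needed.
  # Copy the image from the local Docker cache into the k3d cluster.
  run k3d image import "$IMAGE" --cluster "$CLUSTER"
  # Run the tests as usual (the make target name is a placeholder).
  run make "$TEST_TARGET"
}
```

Because nothing is pushed, the same steps would work for a pull request from a fork, which has no access to the registry secret.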

Proposal

In summary, the proposal here is:

  • Declare that it's OK to rebuild images between test & release, as long as we're building from the same Git commit.
  • Use the freedom of that declaration to simplify CI by building images before the test phase, so that there's no need for docker push during CI.