
Discussion on Github Actions, Packages, and Buildah #723

Open
sloretz opened this issue Mar 3, 2024 · 8 comments

@sloretz
Contributor

sloretz commented Mar 3, 2024

Given the effort that's been made to try to solve #112, I looked at creating images with GitHub Actions using buildah and hosting them on GitHub Packages. It seems like a pretty good option to me, so I'm opening a ticket to discuss whether the official images could benefit from any of it.

https://github.com/sloretz/ros_oci_images

About Github Packages I learned:

  • Images are associated with an account or organization, NOT a repo. Official packages could be named ghcr.io/osrf/ros to make an image named ros in the ghcr.io/osrf registry
  • Pushing new versions to GitHub Packages seems to be rate limited. Trying to push all images for a ROS distro at once (12 images + 6 manifests for Rolling) requires backoff and retry logic that might wait for 10 minutes to get a successful push. Pushing them one at a time as they are built (roughly 1 image every few minutes) seems to be no problem.
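
A minimal sketch of the backoff-and-retry logic mentioned above. The function name, the delays, and the `flaky_push` stand-in are assumptions for illustration only; the actual push command and limits used in the linked repository are not shown in this thread.

```python
import time


def retry_with_backoff(action, max_attempts=5, base_delay=1.0,
                       max_delay=600.0, sleep=time.sleep):
    """Call action() until it succeeds, doubling the wait after each failure."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(min(delay, max_delay))  # never wait longer than max_delay
            delay *= 2


# Hypothetical stand-in for a registry push that is rate limited twice.
calls = []
waits = []


def flaky_push():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("rate limited")
    return "pushed"


# Record the waits instead of sleeping so the example runs instantly.
result = retry_with_backoff(flaky_push, sleep=waits.append)
```

In a real workflow `action` would wrap the push command (for example via `subprocess.run(..., check=True)`), and `max_delay=600.0` mirrors the roughly 10 minute wait described above.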

About buildah I learned:

  • Recent versions of buildah are great, but the CLI has already changed a bit since the version in Jammy. I pinned to a specific version and built it from source.
  • By default all layers are squashed in buildah, so you don't have to work as hard to clean up intermediate layers if you don't want to.
  • I found buildah with a Containerfile/Dockerfile easy to understand. To be honest I've never really felt like I understood the path from osrf/docker_templates to the Dockerfiles here.
  • I had to play around a bit to find a version of qemu-user-static that would successfully build all of the ROS images. Using Fedora 39 with qemu-user-static 8.1.3 I was not able to build all images on all architectures. The version on ubuntu-latest in Github Actions works fine though.
@sloretz
Contributor Author

sloretz commented Mar 3, 2024

Oh, and how it solves #112: A github action checks every 6 hours if the desktop-full variant for a ROS distro could be updated. If so it rebuilds all the images for that distro. I also added a job that runs once a week to rebuild all images to catch non-ROS package updates.

@mikaelarguedas
Contributor

mikaelarguedas commented Mar 4, 2024

Thanks @sloretz !

To clarify in 2 lines, the suggestion here is:

  • To move away from dockerhub and the official docker library completely
  • Host, build, and push images automatically on a periodic basis, but do everything on the GitHub infrastructure instead of dockerhub

Is that correct?

(for the following I'm speaking under @ruffsl and @sloretz control, do not hesitate to clarify / correct the statements below)


Current state of things:

  • ros (non gui) images are part of the official docker library.
    • The issue ([ros] Provide exact version numbers to ensure rebuilds #112) is that the only way to trigger rebuilds is to modify Dockerfiles or wait for the base image (ubuntu) to be rebuilt.
    • Our ability to trigger rebuilds based on package versions is impacted by how the versioning system works on the ROS buildfarm: the same package has a different version number on each architecture because the build timestamp is embedded in the version. So we can't use the same Dockerfile for all architectures, as the package version differs from one architecture to the next.
  • ros gui images (desktop, desktop-full, ...) are hosted on the osrf dockerhub profile, which we can push images to at will.
  • On a daily basis:
    • CI checks if the versions on dockerhub are outdated (but as it uses only the major.minor.patch version of the variant packages, it very rarely triggers despite new ROS syncs happening)
    • If official images changed, the osrf hosted images are rebuilt and pushed to the osrf's dockerhub
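
To illustrate why a major.minor.patch comparison rarely triggers, here is a small sketch. The first version string is the one quoted later in this thread; the second is a hypothetical rebuild of the same release with a later timestamp, and `upstream_version` is an illustrative helper, not part of the existing CI.

```python
def upstream_version(debian_version):
    """Strip the Debian revision (everything after the last '-')."""
    return debian_version.rsplit("-", 1)[0]


installed = "0.10.0-2jammy.20240216.184241"
candidate = "0.10.0-2jammy.20240301.120000"  # hypothetical later rebuild

# Comparing only the upstream part misses the timestamp-only change...
coarse_changed = upstream_version(installed) != upstream_version(candidate)
# ...while comparing the full version string catches it.
full_changed = installed != candidate
```

The sketch ignores Debian epochs (a leading `1:`), which would need separate handling.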

Pros of the approach suggested in this ticket:

  • full control of pushing and hosting ensuring up to date images
  • multi arch build

Cons:

  • ros is not an official docker image anymore (-> impact?)
  • rate limitation on push

Things to consider:

  • hand modification of images instead of generation from templates -> impact on maintenance ?
  • what to do with all images on dockerhub

@sloretz
Contributor Author

sloretz commented Mar 4, 2024

To clarify in 2 lines, the suggestion here is: [...] Is that correct?

I'd say the one line summary is "look at this cool thing I made", but more concretely I think some useful changes here would be:

  • support the docker library for the sake of stability, but add actions to check for updates and then build/push images to ghcr.io/osrf/ on a more frequent basis. Support both indefinitely, but maybe move tutorials to ghcr.io when enough time passes that we're confident it's a reliable-enough option.
  • Change the "is update available" mechanism to use apt-get upgrade -s to catch timestamp-only changes
  • Change template system to use Dockerfiles with build arguments. There would be some repetition, but I think it would make the images easier to contribute to.
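
A rough sketch of how the simulated-upgrade check could be consumed from a script. It assumes the usual `Inst <package> [<old>] (<new> ...)` lines that `apt-get upgrade -s` prints for pending upgrades; the sample output below is made up for illustration (including the `ROS:jammy` origin and the newer version), and `pending_upgrades` is a hypothetical helper.

```python
import re

# Matches lines such as: Inst pkg [old-version] (new-version origin [arch])
INST_LINE = re.compile(r"^Inst (\S+) \[([^\]]+)\] \((\S+)")


def pending_upgrades(simulated_output):
    """Map package name -> (installed version, candidate version)."""
    upgrades = {}
    for line in simulated_output.splitlines():
        match = INST_LINE.match(line)
        if match:
            name, old, new = match.groups()
            upgrades[name] = (old, new)
    return upgrades


# Made-up sample of simulated output, for illustration only.
sample = """\
Reading package lists...
Inst ros-rolling-ros-core [0.10.0-2jammy.20240216.184241] (0.10.0-2jammy.20240301.120000 ROS:jammy [amd64])
Conf ros-rolling-ros-core (0.10.0-2jammy.20240301.120000 ROS:jammy [amd64])
"""

upgrades = pending_upgrades(sample)
needs_rebuild = bool(upgrades)
```

Inside an image, the real text could come from `subprocess.run(["apt-get", "upgrade", "-s"], capture_output=True, text=True).stdout` after an `apt-get update`. Because the full version strings are compared by apt itself, a timestamp-only change still shows up as a pending upgrade.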

@ruffsl
Member

ruffsl commented Mar 8, 2024

To move away from dockerhub and the official docker library completely

This would be a major change to our infrastructure, not to mention perhaps a bit disruptive or confusing for new and old end users alike. I'm not totally for, nor against, but it would be a serious undertaking regardless.


Logistically for infrastructure,

OSRF's github org would probably require significantly more resource credits to perpetually build and host all official ros docker images. E.g. GitHub Action credits for running CI jobs and GitHub container registry pull counts. Currently, a lot of this infrastructure overhead we offload to DockerHub, who operate the multi architecture docker engines to build our images (for every supported platform), while also hosting the docker image registry (with enough bandwidth and unrestricted aggregate public pull count quotas) for the image repo.

Even for unofficial images under OSRF's own DockerHub org, we still currently benefit from being enrolled in Docker's Sponsored Open Source Program, extending similar less restricting pull quotas for our GUI based images as well. @tfoote can speak to the details on the time and effort it took to get that all initiated and finalized.

I'm guessing GitHub has similar sponsorship programs for open source projects with larger resource usage requirements, but we'd probably want to check into it and get that ball rolling quickly if we decide to migrate, lest we just as quickly get rate limited when building or hosting our ROS images using only GitHub.


From a user perspective,

DockerHub has the historic (perhaps controversial) benefit of being the default image registry for most out of the box container tools. As such, switching registries would of course require users to update all Dockerfile directives, build ARGs, build scripts, CLI muscle memory, etc, to include the added domain name of GitHub's own container registry URI. For example, common approaches such as this would have to be updated everywhere:

ARG ROS_DISTRO=rolling
FROM ros:$ROS_DISTRO
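
As a sketch of why the short name works today, the helper below expands an image reference roughly the way Docker resolves it: a bare name falls back to `docker.io/library/`, while a name on another registry must carry its domain. `fully_qualified` is a hypothetical helper, it does not handle digest references (`@sha256:...`), and the example tags are for illustration only.

```python
def fully_qualified(image):
    """Expand a short image reference to registry/namespace/name:tag."""
    # A ':' after the last '/' separates the tag from the name.
    last_slash = image.rfind("/")
    colon = image.rfind(":")
    if colon > last_slash:
        name, tag = image[:colon], image[colon + 1:]
    else:
        name, tag = image, "latest"

    parts = name.split("/")
    # The first component is a registry host if it looks like a hostname.
    if len(parts) > 1 and ("." in parts[0] or ":" in parts[0]
                           or parts[0] == "localhost"):
        return f"{name}:{tag}"
    if len(parts) == 1:
        name = f"library/{name}"  # official images live under library/
    return f"docker.io/{name}:{tag}"


official = fully_qualified("ros:rolling")
osrf_hub = fully_qualified("osrf/ros:humble-desktop")
ghcr = fully_qualified("ghcr.io/osrf/ros:rolling")
```

Only the last form needs the registry spelled out by the user, which is the extra typing the migration would impose.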

ros is not an official docker image anymore (-> impact?)
what to do with all images on dockerhub

That is to say, to make migration simple, we'd probably want to mirror all past ROS images on GitHub's container registry as well. But then to avoid breaking legacy setups, we'd have to leave up the ROS images already published on DockerHub's official image library, not that Docker Hub librarians would let maintainers yank archived images anyhow. Yet this duplication would probably cause a lot of confusion as image tags fall out of sync, regardless of public announcements or deprecation notices posted, given how most folks use official images via the CLI and rarely return to check a repo's webpage for a library image.

@ruffsl
Member

ruffsl commented Mar 8, 2024

I looked at creating images with Github Actions using buildah and hosting them on Github Packages.

If switching to GitHub Actions anyway, what would be the difference between using buildah vs. GitHub's official "Build and push Docker images" action? Is it the aspect of building images via CLI and scripts? Any advantage compared to buildkit's native CLI or full python SDK? Although, I don't think the python SDK has full support yet for buildkit.

For example, here is the github action I wrote to efficiently rebuild the Nav2 CI image using buildkit and image layer caching:

Where every day it checks the image to see if any ros packages are updatable:

@ruffsl
Member

ruffsl commented Mar 8, 2024

Change template system to use Dockerfiles with build arguments. There would be some repetition, but I think it would make the images easier to contribute to.

One reason we haven't yet adopted build ARGs is the limits of the official images' review pipelines, or how they don't (didn't?) support variable substitution.

Another reason is probably the question of how to set non-default build ARG values via the docker library manifest, which expects a single/static Dockerfile, at least AFAIK.

On that last point, hypothetically, if we did migrate away from Docker Hub's official library, we could probably make great use of modern multi-stage builds, where each ROS meta package tag could be in-lined as separate multi stages in a single Dockerfile. This would avoid the need of complex makefiles, and cut down on the number of Dockerfiles and folders.

This is because I think Docker Hub's official image library Instruction Format doesn't yet allow for the added specification of a --target stage when mapping tags to Dockerfiles.
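
A sketch of what driving such a single multi-stage Dockerfile from a script might look like, with one build per `--target` stage. The stage names, the tag scheme, and the `build_commands` helper are assumptions for illustration; `--build-arg`, `--target`, and `--tag` are accepted by both `buildah build` and `docker build`.

```python
def build_commands(distro, stages, repo="ghcr.io/osrf/ros", tool="buildah"):
    """Return one build command (as an argv list) per Dockerfile stage."""
    commands = []
    for stage in stages:
        commands.append([
            tool, "build",
            "--build-arg", f"ROS_DISTRO={distro}",
            "--target", stage,                      # build up to this stage only
            "--tag", f"{repo}:{distro}-{stage}",    # hypothetical tag scheme
            ".",
        ])
    return commands


# Hypothetical stage names, one per ROS meta package.
commands = build_commands("rolling", ["ros-core", "ros-base"])
```

Each argv list could be passed to `subprocess.run(cmd, check=True)`. Building the stages in dependency order lets later stages reuse the layers of earlier ones.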

@ruffsl
Member

ruffsl commented Mar 8, 2024

Instead, how much more difficult would it be to just bring ROS's package versioning scheme closer to something more easily cacheable, yet busted after periodic syncs? E.g., denoting the sync as a first-class marker in the package version string:

Package: ros-rolling-ros-core
-Version: 0.10.0-2jammy.20240216.184241
+Version: 0.10.0-2jammy.sync-42.20240216.184241
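
If such a marker existed, a rebuild check could key on it alone. The sketch below assumes the hypothetical `sync-<N>` field proposed in the diff above; it is not part of any current ROS version string, and the arm64 timestamp is invented for illustration.

```python
import re

# Hypothetical marker from the proposal above: ".sync-<number>."
SYNC_MARKER = re.compile(r"\.sync-(\d+)\.")


def sync_number(version):
    """Return the sync number embedded in a version string, or None."""
    match = SYNC_MARKER.search(version)
    return int(match.group(1)) if match else None


amd64 = "0.10.0-2jammy.sync-42.20240216.184241"
arm64 = "0.10.0-2jammy.sync-42.20240216.190512"  # invented timestamp

# Different build timestamps, same sync: the cache key is identical.
same_sync = sync_number(amd64) == sync_number(arm64)
legacy = sync_number("0.10.0-2jammy.20240216.184241")
```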

Given the points made in this comment, I guess this would still bust the build cache for all architectures simultaneously, but I suppose I'd still trade that off to avoid having to hardcode more into Dockerfiles, or maintain more machinery:

@sloretz
Contributor Author

sloretz commented Mar 13, 2024

Thanks for the thorough reply Ruffin!

To move away from dockerhub and the official docker library completely
This would be a major change to our infrastructure, not to mention perhaps a bit disruptive or confusing [...]

Agreed. I think at most I would recommend hosting images on both Dockerhub and Github Packages - at least for ROS Distros that already exist on Dockerhub.

OSRF's github org would probably require significantly more resource credits to perpetually build and host all official ros docker images. E.g. GitHub Action credits for running CI jobs and GitHub container registry pull counts.

I wondered about that when I started, but at least on my personal account it seems both storage and data transfer are free. Who knows if Github will keep being this generous forever though.

If switching to Github Actions anyway, what would be the difference between using buildah vs. Github's official "Build and push Docker images" action? Is it the aspect of building images via CLI and scripts?

Building images via the CLI is the main thing I was looking for. When making github actions I've often found it easier to do most of the logic in a script where I can iterate locally.

Any advantage over compared to buildkit's native CLI or full python SDK?

I liked the experience using buildah because it makes it easy to create images without being root, but I don't know of any features buildah has that buildx lacks. I looked briefly at a few Python APIs, but I don't remember why I decided not to use any of them.

Another reason is probably the question of how to set non-default build ARG values via the docker library manifest, which expects a single/static Dockerfile, or AFAIK...

The static Dockerfile requirement does seem like a significant limitation on how images can be generated here. I really like how the build argument FROM statement turned out. There's some duplication between ROS 1 and 2, but overall not that much.

Maybe for the purposes of the Docker library we could reduce the Dockerfile generation to replacing the ARG and FROM lines with a static string?
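
A minimal sketch of that reduction, assuming the Dockerfile declares the argument once before the first FROM and uses it only in FROM lines. `make_static` is a hypothetical helper, and uses of the argument inside later instructions are deliberately left untouched.

```python
import re


def make_static(dockerfile_text, arg_name, value):
    """Drop the global `ARG <arg_name>` line and inline its value into FROM lines."""
    arg_line = re.compile(r"ARG\s+" + re.escape(arg_name) + r"\b", re.IGNORECASE)
    reference = re.compile(
        r"\$\{" + re.escape(arg_name) + r"\}|\$" + re.escape(arg_name) + r"\b"
    )
    out = []
    seen_from = False
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if not seen_from and arg_line.match(stripped):
            continue  # the global ARG is no longer needed
        if stripped.upper().startswith("FROM "):
            seen_from = True
            line = reference.sub(lambda m: value, line)
        out.append(line)
    return "\n".join(out)


template = """\
ARG ROS_DISTRO=rolling
FROM ros:$ROS_DISTRO
RUN echo hello
"""

static = make_static(template, "ROS_DISTRO", "rolling")
```

The result is a plain Dockerfile starting with `FROM ros:rolling`, which is the single/static form the docker library manifest expects.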

Instead, how much more difficult would it be to just bring ROS's packaging versioning scheme closer to something more easily cacheable, yet busted after periodic syncs?

Ah this is a tough one. The package versions are decided when the package is built, which is long before the sync happens. IIUC the sync just copies the packages from the testing apt repo into the main apt repo.
