feat: push distribution images to GHCR #4130
Conversation
This addition enables pushing distribution images into GHCR. This is in addition to the Docker Hub push which remains in place. Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
LGTM
Note: if you want to mirror current images on Docker Hub to GHCR you can use https://github.com/crazy-max/ghaction-dockerhub-mirror
@crazy-max would there be advantages in using that? (Thinking that using it might give a better guarantee that both registries have the exact same digest.)
You mean the mirror action? If so, it should be a one-time use: don't push from here, but use it from another repo if the user has the credentials, of course. Or just use this script https://github.com/crazy-max/ghaction-dockerhub-mirror/blob/5eefa440a0262261065dd3a57cebc860c8e135f0/action.yml#L35-L68 if brave enough 😅
The image is built only once by the CI. The very same image is pushed to two distinct registries. I don't understand the point around "the same digest": pushing the very same image doesn't change its digest.
Except when the digest changes on push because of various compression and other reasons of course :) |
If the digests change in transit then we are pretty doomed I'd say. Can you give me an example of where/when you've seen a case like that? Assume the current CI behaviour:
If the image itself was built twice then sure: docker builds are still not reproducible under some circumstances, but the build here is done only once; the image manifest is generated and all the layers/indices are appropriately referenced. If there is compression in transit, that should not change the digests of the layers, as they're specified in the built image manifest. Curious.
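To make the claim concrete: a registry digest is nothing more than a SHA-256 over the exact manifest bytes, so pushing byte-identical content to two registries necessarily yields the same digest. A minimal sketch, not real registry code (the manifest content below is made up, not a valid distribution manifest):

```python
import hashlib
import json

# A made-up, minimal image manifest; real manifests follow the OCI image spec.
manifest_bytes = json.dumps({
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "layers": [{"digest": "sha256:aaaa", "size": 1234}],
}).encode("utf-8")

def digest(blob: bytes) -> str:
    """Content-addressed digest, as registries compute it over the raw bytes."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()

# "Pushing" the identical bytes to two registries: same digest on both sides.
dockerhub_digest = digest(manifest_bytes)
ghcr_digest = digest(manifest_bytes)
assert dockerhub_digest == ghcr_digest
```

The digest only diverges if the bytes themselves change between pushes, which is the scenario discussed below.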
I can't say I have a deep understanding, but depending on the capabilities of the target registries, I believe docker can push differently too. It was a throwaway comment really; does it matter if the digests are different on each registry? We should sign the images in some way, and a signature includes the registry/repo/tag too, IIRC.
@dmcgowan @corhere can you please confirm this as this strikes mild panic in my body 😄 if I have an image
Yes, it does. If we are going to push to both we must make sure the users are getting the same content uniquely identifiable by the
Yes we should. The Referrers API was meant to address attaching signatures (and other metadata) to images, but that's got a way to go to be codified in the spec. That being said, I believe
It doesn't tend to be the registry; it tends to be the clients. For example, skopeo has cases where it will change the digest on copy (e.g. containers/skopeo#1451); I actually added an option to skopeo to make it fail in these cases instead of pushing a modified manifest. A docker image doesn't actually have a digest just from being built, but docker does store digests from the registry when pulling them now; I don't know if it ensures the same digest if the image is then pushed to a different registry. An example of a simple pull->tag->push resulting in a different digest, even with both repos being on Docker Hub:
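The terminal output for that pull->tag->push example isn't reproduced here, but the mechanism behind such digest changes can be sketched: if a client parses a manifest and re-serializes it before pushing (different key order, whitespace, or defaults), the bytes change, and therefore so does the digest, even though the content is semantically identical. A hypothetical illustration (the manifest is made up):

```python
import hashlib
import json

def digest(blob: bytes) -> str:
    # Registries compute the digest over the raw manifest bytes.
    return "sha256:" + hashlib.sha256(blob).hexdigest()

manifest = {"schemaVersion": 2, "layers": [{"digest": "sha256:aaaa", "size": 1}]}

# The bytes the first registry stored.
original = json.dumps(manifest).encode("utf-8")

# A client that parses and re-serializes before pushing elsewhere,
# e.g. with sorted keys and indentation.
reserialized = json.dumps(manifest, sort_keys=True, indent=2).encode("utf-8")

assert json.loads(original) == json.loads(reserialized)  # same content...
assert digest(original) != digest(reserialized)          # ...different digest
```

This is why tools that copy images between registries need to push the original bytes verbatim if the digest is to survive.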
We've found the only way to get digest stability when pushing to multiple repos is to push to one repo, and then copy it from there to everywhere else with skopeo and its |
Yes, as I said earlier
Emphasis on the image is pushed by the same client 😄 which is the case in our CI pipeline. I see no issue with this PR 🤷‍♂️ but again, I'm willing to be persuaded and to learn something new. I know the clients can be silly sometimes, but that's not our concern: our concern here is making the identical image available from different registries.
Yes, what we're doing should be safe, but I don't believe it is guaranteed to be safe. There's a reason docker stores multiple repo digests for each image in the local store. And like I said, I don't think it matters if the digests are different in each repo. Also, I honestly thought I had already approved this; my comments were certainly not saying that I thought the PR shouldn't go in :)
PTAL @thaJeztah |
It depends... With the containerd image-store integration, pushes are reproducible, because the image, image manifest, and compressed layers are created once and stored on disk. Pulling an image also keeps the distribution image (and layers) in the content store. This also means that images from different registries (or using different compression) will be stored twice, but may (after extracting) produce the same image.

This is an essential difference in design between containerd's image store and the original design of the docker daemon image store (using "graph drivers"). The graph-driver store was designed around the local image: the image config and uncompressed layers. These layers are fully reproducible, and when pulling an image it would extract the layers (using tar-split to reconstruct the original, uncompressed tar). However, it does not keep any of the distribution artifacts; the runtime has no use for the compressed layers, manifest index, or manifest, so those are discarded (though metadata about them is kept). The design here was both to only handle the artifacts used by the runtime itself (distribution being a separate concern) and to optimize for storage.

When pushing an image to the same registry, it would detect layers that were not modified and would not re-push them; but when pushing to a different registry, it creates new content (manifest, compressed layers), which is non-reproducible, so pushing to 2 different registries will result in different digests. This scenario was less common, and "transfer content between registries" (pull from one, push to another) was mostly out of scope and better suited for dedicated tools. Containerd's design was different from the start, with more focus on reproducibility of distributed artifacts, at the cost of storage (considered "cheap" for production scenarios).
So containerd keeps the distribution artifacts when pulling: it keeps BOTH the compressed and the uncompressed/extracted artifacts (and, when using multiple snapshotters, multiple copies of the extracted artifacts). As to the bake-action: I think it does not use the docker (graph-driver) image store and produces distribution artifacts once (pushing multiple times), but honestly I'm not deeply familiar with the approach it takes; I'd have to defer to @crazy-max on that one.
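The compressed-vs-uncompressed distinction above can be illustrated: re-compressing the same layer tar with different settings yields different compressed blobs (hence different distribution digests), while the digest of the uncompressed bytes (the DiffID) stays stable. A sketch with gzip standing in for layer compression (the layer bytes are made up):

```python
import gzip
import hashlib

def digest(blob: bytes) -> str:
    return "sha256:" + hashlib.sha256(blob).hexdigest()

# Stand-in for an uncompressed layer tarball.
layer_tar = b"pretend this is an uncompressed layer tarball" * 1000

# The uncompressed DiffID is stable regardless of how we compress.
diff_id = digest(layer_tar)

# Re-compressing with different settings produces different blobs
# (mtime=0 just keeps each individual compression deterministic).
fast = gzip.compress(layer_tar, compresslevel=1, mtime=0)
best = gzip.compress(layer_tar, compresslevel=9, mtime=0)
assert digest(fast) != digest(best)

# ...but both decompress back to bytes with the same DiffID.
assert digest(gzip.decompress(fast)) == diff_id
assert digest(gzip.decompress(best)) == diff_id
```

This is why a store that discards the compressed artifacts must re-create them on push, with no guarantee of reproducing the original digest.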
Thanks, @thaJeztah, this was a super interesting insight into how all the machinery works. I'm not gonna lie, it made me feel intensely depressed on the one hand (how we accepted this in our industry kinda boggles my mind), but very happy on the other (my personal laptop runs off the Nix store, where I don't have to deal with this 😄). This opens a question though: should we take a different approach in this PR? Any recommendation @crazy-max ?
Just thought I'd mention that I already do this in the #4083 PR for the release automation.
This enables pushing distribution images into GHCR.
NOTE: This is in addition to the Docker Hub push, which remains in place unchanged.
Closes #4125
PTAL @crazy-max 😄