garbage-collection clarification needed #406

Open
rogpeppe opened this issue Apr 28, 2023 · 18 comments

@rogpeppe
Contributor

rogpeppe commented Apr 28, 2023

As a newcomer to OCI registries, looking through the docs, I wasn't able to easily find out how the overall referential data model works. For example, until I discovered otherwise, I assumed that a manifest in one repository would be able to directly reference a blob that was uploaded to another repository within the same registry. Likewise, I haven't found a place that explicitly defines when a blob might be deleted by the garbage collector.

I came up with the following wording to explain the rules as I've come to understand them over the last few days. Does this accurately describe the intended semantics? Perhaps the spec could contain something like this.

GC rules for OCI registries

A registry holds a set of repositories. Each repository in a registry is logically separate from all other repositories, although actual content MAY be shared between them.

A repository holds a set of objects that it retains references to. When an object is no longer needed by a repository, it is released, which MAY remove the underlying data if it's not referred to by any other repository.

The object of garbage collection is to release any objects that aren't needed by the repository. After garbage collection, all objects not marked as live are released.

When a manifest or tag is created, any objects that it references MUST be live.

There are three categories of object: a manifest, a manifest list, and a blob. Blobs are data-only: they do not hold references to other objects.

The set of live objects is defined as follows:

  • for all tags T, the object referred to by T is live.
  • if a manifest list is live, all the manifests in the list are live
  • if a manifest is live, its config blob and all blobs in its layers are live.
  • if object B is live and manifest A has subject B, A is live.

All live objects MUST be retained by the repository and not deleted.
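The four liveness rules above amount to a reachability walk starting from the tags. A minimal sketch, assuming the repository's contents are available as plain dictionaries (the function name and the data layout are hypothetical, and this is not taken from any registry implementation):

```python
from collections import deque


def live_objects(tags, manifest_lists, manifests, subjects):
    """Return the set of live digests under the four proposed rules.

    tags:           tag name -> digest
    manifest_lists: manifest list digest -> child manifest digests
    manifests:      manifest digest -> blob digests (config + layers)
    subjects:       manifest digest -> digest of its subject, if it has one
    """
    # Reverse index: subject digest -> manifests that name it as their subject.
    referrers = {}
    for manifest, subject in subjects.items():
        referrers.setdefault(subject, []).append(manifest)

    live = set()
    queue = deque(tags.values())  # rule 1: anything a tag refers to is live
    while queue:
        digest = queue.popleft()
        if digest in live:
            continue
        live.add(digest)
        queue.extend(manifest_lists.get(digest, ()))  # rule 2: list -> manifests
        queue.extend(manifests.get(digest, ()))       # rule 3: manifest -> blobs
        queue.extend(referrers.get(digest, ()))       # rule 4: subject -> referrers
    return live
```

Anything not in the returned set is what a collector following these rules MAY release. Blobs never appear as keys in any of the maps, which matches the statement that they hold no references.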

See also #378

@sudo-bmitch
Contributor

Calling them rules implies there's a spec or policy that applies to every registry. We've avoided defining anything in the OCI spec, possibly to an extreme, and delegated that to each implementation. There are lots of policies and various implementations, each doing this differently. Several examples include:

  • ttl.sh deletes anything after a deadline, but at last check their delete was broken for multi-platform content.
  • distribution/distribution has a long-standing open issue: deleting untagged manifests also deletes untagged platform-specific manifests that are listed in a tagged index (their implementation was not recursive).
  • Docker Hub retains untagged manifests that were at some point in the past tagged, and that tag still exists (but may now point to a new digest).
  • Organizations may implement their own retention policies on their registry, e.g. deleting builds that failed CI after a week, and builds that never reached production after 3 months.

Most registries will not delete a blob if a manifest references that blob. Tags are a common way to indicate a manifest should not be removed by GC.

If we want to begin defining expected GC policies, it may be best to create a new working group to review the various implementations and identify what, if anything, we can define in a spec.

@rogpeppe
Contributor Author

rogpeppe commented Apr 28, 2023

The way I see these rules is that they specify when a registry GC implementation MAY release an object, not whether it MUST, or even SHOULD. To my mind that's the most important aspect: if there are no rules, then there's no assurance that a given object won't be deleted underfoot by some rogue GC implementation that's nonetheless operating according to the underspecified rules currently in play. Maybe that's OK, but ISTM that everyone in practice depends on the rule that (for example) a manifest and the blobs it refers to won't be deleted as long as there's a tag that refers to the manifest.

Another thing, and not strictly GC-related: is there anywhere that specifies that the namespace for digests within a repository is restricted to that repository? That is, it's (at least on some registries!) necessary to push a given blob to a repository even though we know that it's already part of some other repository in the same registry. Or is that another thing that's implementation-defined? This behaviour was certainly a surprise to me when I wrote my initial PoC code, even though it makes total sense in retrospect.
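For context, the distribution spec does let a client ask for an existing blob to be mounted from another repository instead of re-uploading it, via `POST /v2/<name>/blobs/uploads/?mount=<digest>&from=<other_name>`. A minimal sketch of building that request URL (the function name and the example registry host are hypothetical, and whether a given registry honours the mount is up to the implementation):

```python
from urllib.parse import urlencode


def mount_url(registry, target_repo, source_repo, digest):
    """Build the URL for a cross-repository blob mount request.

    The request is a POST. A registry that performs the mount answers
    201 Created; one that does not falls back to a normal upload session
    (202 Accepted), so the client still has to be ready to push the blob.
    """
    query = urlencode({"mount": digest, "from": source_repo})
    return f"{registry}/v2/{target_repo}/blobs/uploads/?{query}"
```

For example, `mount_url("https://registry.example", "team/app", "library/base", "sha256:abc")` targets the `team/app` repository and names `library/base` as the source.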

@sudo-bmitch
Contributor

I attempted to clarify that references are specific to a repository, rather than anything in a registry, but that hasn't been merged and I'm not sure what will unblock it.
#325

@rogpeppe
Contributor Author

I guess my main question is: as someone who would like to write client code that stores artifacts in an implementation-agnostic fashion, do the rules I've stated correspond correctly to the de-facto rules understood and implemented by most registries? Are there any notable exceptions that behave differently?

@sudo-bmitch
Contributor

Typically, registries will keep manifests that are tagged and child objects of those manifests, recursively (manifests listed in an index, and blobs listed in an image).

Exceptions I can think of include:

  • registries that preserve untagged manifests
  • registries that implemented retention policies to delete tags
  • long standing defects in the GC code provided with the reference implementation
  • content manually deleted by users (there is a blob delete API)

The subject/referrers API will also impact this. Ideally as long as the subject manifest exists, the referrer to that manifest (the one with the subject field) would not be subject to GC.

@SteveLasker
Contributor

Hi @rogpeppe,
These are great questions, though the answers ultimately come down mostly to implementation details.
For instance, when you push an artifact that has 2 blobs to repo1, they both get pushed.
When you push that same artifact to repo2, in the same registry, the blobs will typically get de-duped. However, that's largely an implementation detail around saving space, managing security and providing concurrent pull performance. While the user that pushed the artifact to repo1 and repo2 has push permissions, another user may only have pull permissions to repo2.

The details are all about how the registry implements ref counting on the de-duped objects.
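As an illustration of the ref-counting idea, here is a minimal in-memory sketch, assuming one shared byte store plus a per-digest set of repositories that link to it. The class and method names are hypothetical and no real registry is claimed to work this way:

```python
import hashlib


class DedupedBlobStore:
    """Stores each blob once per registry, with per-repository links."""

    def __init__(self):
        self._data = {}   # digest -> bytes, shared across repositories
        self._links = {}  # digest -> set of repositories linking to it

    def push(self, repo, data):
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        self._data.setdefault(digest, data)  # de-dupe: keep the first copy
        self._links.setdefault(digest, set()).add(repo)
        return digest

    def has(self, repo, digest):
        # Visibility is per repository, even though the bytes are shared.
        return repo in self._links.get(digest, set())

    def release(self, repo, digest):
        """Drop repo's link; return True if the underlying bytes were removed."""
        repos = self._links.get(digest)
        if not repos or repo not in repos:
            return False
        repos.discard(repo)
        if repos:
            return False  # another repository still references the blob
        del self._links[digest]
        del self._data[digest]
        return True
```

Pushing the same bytes to repo1 and repo2 stores them once. Releasing from repo1 leaves the data in place, and only the release from repo2 removes it. A user with access only to repo2 never sees the blob through any other repository.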
The reference types are the newest addition, which adds some complexity, as they're a reverse lookup. In your above scenario, if you push net-monitor:v1 to the registry, then push some statements, signatures, SBOMs, etc., those all hang off the net-monitor@sha256:abc123 image digest. You can and should be able to delete any of the children, and if you delete the parent, the child references should be deleted; otherwise, you wind up with zombie objects. The "should" is the flexibility.

I'm looking to see whether the distribution spec defines lifecycle management, but I'm not finding it yet. @sudo-bmitch, has this been queued up?
Here's a writeup that framed SHOULD vs. MUST to provide registry implementations options for how they wanted to serve their users: Lifecycle Management.

@rogpeppe
Contributor Author

One thing that my rules don't make clear (and that I'm not clear on myself tbh) is whether an object pushed to the blobs endpoint can potentially maintain references to other blobs. For example, if I upload a blob with media type application/vnd.oci.artifact.manifest.v1+json to /v2/foo/blobs/sha256:$digest, is it possible that some implementations might see that media type and say "look! a manifest!" and keep alive the digests that it references? Or is there a hard line between objects uploaded to blobs (that can't keep things live) and objects uploaded to manifests (that can)?

@SteveLasker
Contributor

Manifests are uploaded through the manifest API. They just happen to be persisted as blobs, which is also an implementation detail.
Manifests are the anchors for GC tracking. Blobs are really an implementation detail and are largely ignored by the registry. Again, this isn’t defined in the specs, but users interact with manifests through digests and tag references. Blobs are served as a result of a manifest exchange with a client to negotiate how the blobs are returned.
One exception is that registry UX cracks open the config object to display platform info for container images, but that’s also based on the manifest.config reference, and specific to runtime container images.

@hdonnay

hdonnay commented Apr 29, 2023

I don't think

> if a manifest list is live, all the manifests in the list are live

is necessarily true.
The bit from the spec that says

> A registry MAY reject a manifest of any type uploaded to the manifest endpoint if it references manifests or blobs that do not exist in the registry.

makes me think that a client may be able to copy a manifest list along with only the manifests it knows a priori will be used. This would make it possible to mirror content to a cluster-local registry without breaking a signing scheme, for example.

@SteveLasker
Contributor

Good point, @hdonnay. Manifest lists are their own beast, and not consistent across implementations. Some interesting interpretations have also evolved for putting blobs in a “manifestList”, and it would be interesting to see what registry implementations do when such a manifest list is deleted. Would the blobs get deleted?
I don’t want to get too off topic. I’d suggest we focus on manifests and layers|blobs here, and open a separate issue for manifest lists, as that will be a much longer thread.

@rogpeppe
Contributor Author

Another thing that's not entirely clear to me: is it OK for a manifest to contain a reference to another manifest from the same repository as one of its layers/blobs?

@SteveLasker
Contributor

SteveLasker commented Apr 29, 2023

There were some attempts at defining a collection of descriptions, but it was felt to be too complex to track lifecycle:

Here’s a more explorative attempt;

@rogpeppe, what are you looking to achieve? That might help prompt some ideas to guide a path

@rogpeppe
Contributor Author

> @rogpeppe, what are you looking to achieve? That might help prompt some ideas to guide a path

Ideally I'd like it to be the case that when someone like me comes to the OCI project for the first time, they have enough information that they can meaningfully use the API, understanding the basic data model, what guarantees it provides, and what it doesn't.

Without that, it seems hard to me to be able to write client code that meaningfully works correctly across different registry implementations, or a registry implementation that fulfils client expectations.

If there are particular aspects that vary across different registries (like, for example, if there's a registry that doesn't understand about references at all and removes blobs regardless, which would be within the letter of the spec AIUI, if not the spirit), then perhaps some page could provide a matrix of features vs implementations to give some idea to readers of what's "normal".

For myself, I don't need anything at this point, because by writing experimental code and running it on a couple of different implementations, I think I have a grasp of the generally understood rules. But I may well be wrong, because I haven't tried all implementations, and I'm sure there are some outliers there.

As an example of a point where the current state of affairs seems to become problematic to me, from @sudo-bmitch:

> Most registries will not delete a blob if a manifest references that blob.

This sounds to me like there's at least one registry out there that will delete a blob even when there's a manifest that references it. Doesn't that break every expectation that people might have of a registry? I upload my docker image, tag it, and suddenly I can't use it because the registry has decided to remove one of the layers it references. I understand that the spec needs to reflect actual current implementation behaviour, but SHOULD and MAY are available, and I am totally the naive outsider here, but isn't it possible to set some expectations in the spec in this respect, at least?

@sudo-bmitch
Contributor

> As an example of a point where the current state of affairs seems to become problematic to me, from @sudo-bmitch:
>
> > Most registries will not delete a blob if a manifest references that blob.
>
> This sounds to me like there's at least one registry out there that will delete a blob even when there's a manifest that references it. Doesn't that break every expectation that people might have of a registry? I upload my docker image, tag it, and suddenly I can't use it because the registry has decided to remove one of the layers it references.

@rogpeppe have you had a chance to look at https://ttl.sh/? I'd also recommend looking at the following issue: distribution/distribution#3178

@sudo-bmitch
Contributor

> One thing that my rules don't make clear (and that I'm not clear on myself tbh), is whether it's possible that an object pushed to the blobs endpoint can potentially maintain references to other blobs.

A blob could reference another blob, or anything else, but it's typically opaque data to the registry (exceptions include parsing the image config for a UI, and scanning layer content for vulnerability reports). Registries should only look at manifest content when they implement their GC policy. I suspect there would be a lot of pushback if OCI defined blob content that changed that assumption.

@sudo-bmitch
Contributor

sudo-bmitch commented Apr 29, 2023

> I don't think
>
> > if a manifest list is live, all the manifests in the list are live
>
> is necessarily true.
> The bit from the spec that says
>
> > A registry MAY reject a manifest of any type uploaded to the manifest endpoint if it references manifests or blobs that do not exist in the registry.
>
> makes me think that a client may be able to copy a manifest list but only the manifest it knows a priori that will be used. This would make it possible to mirror content to a cluster-local registry without breaking a signing scheme, for example.

Yes, there's an effort to explicitly support a "sparse manifest" where someone would only mirror the platforms they intend to run in their cluster (copy the entire manifest list byte for byte, but only copy the linux/amd64 and linux/arm64 child images if you don't have mainframes in your environment). And registries that enforce consistency on a manifest list will typically allow you to delete one of those child manifests after the manifest list has been pushed.
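A minimal sketch of the client side of such a sparse mirror, assuming the index is a standard OCI image index whose `manifests` descriptors carry a `platform` object with `os` and `architecture` fields (the function name is hypothetical). The index bytes themselves would be pushed unchanged so the digest and any signatures over it stay valid, and only the selected children are copied:

```python
import json


def children_to_copy(index_bytes, wanted):
    """Return digests of the child manifests worth mirroring.

    index_bytes: the raw image index, exactly as fetched from the source
    wanted:      set of (os, architecture) tuples the cluster actually runs
    """
    index = json.loads(index_bytes)
    selected = []
    for descriptor in index.get("manifests", []):
        platform = descriptor.get("platform") or {}
        key = (platform.get("os"), platform.get("architecture"))
        if key in wanted:
            selected.append(descriptor["digest"])
    return selected
```

Whether the destination registry accepts an index whose other children are missing is, as noted above, implementation-dependent.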

@rogpeppe
Contributor Author

> @rogpeppe have you had a chance to look at https://ttl.sh/. I'd also recommend looking at the following issue: distribution/distribution#3178

Both interesting reads, thanks! (I will probably use ttl.sh now that I know about it).

FWIW the former seems like it could still observe the spirit of things by ensuring that a manifest with a longer TTL can keep a blob with a shorter TTL live. Deleting tags is always possible, and ISTM that all ttl.sh is doing is deleting tags after some time period, not necessarily violating the reference constraints.
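To make that suggestion concrete, here is a minimal sketch in which a blob's effective expiry is the latest expiry of itself and of any manifest that references it. This is purely illustrative: the names are hypothetical, expiries are plain numbers (for example seconds since the epoch), and nothing here describes how ttl.sh is actually implemented.

```python
def effective_expiries(blob_expiry, manifest_expiry, manifest_blobs):
    """Extend each blob's expiry to cover every manifest that references it.

    blob_expiry:     blob digest -> its own expiry time
    manifest_expiry: manifest digest -> expiry time of the manifest
    manifest_blobs:  manifest digest -> blob digests it references
    """
    effective = dict(blob_expiry)
    for manifest, blobs in manifest_blobs.items():
        for blob in blobs:
            # A referencing manifest keeps the blob alive at least as long
            # as the manifest itself lives.
            effective[blob] = max(effective.get(blob, 0), manifest_expiry[manifest])
    return effective
```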

The latter seems more like a tooling bug to me, although I probably don't properly understand the issue after scanning it briefly. Shouldn't docker manifest create mount the descriptors in the manifest list into the destination repository before creating the manifest list? Shouldn't the registry complain that the descriptors don't exist in the destination repo?

> Yes, there's an effort to explicitly support a "sparse manifest" where someone would only mirror the platforms they intend to run in their cluster

That's an interesting use case. I guess one way of doing that while abiding by the suggested rules would be to push a manifest that points to the original manifest as a blob, and maintains links to the required remaining blobs directly. That way any referrers to the original manifest would be maintained but the GC would be free to drop the unwanted content.

> Registries should only look at manifest content when they implement their GC policy.

Is that "should" anywhere in the spec currently? :)

@sudo-bmitch
Contributor

> The latter seems more like a tooling bug to me...

Yes, it is a bug (or open issue) that's existed for over 3 years in the reference implementation of the registry. I think that still covers your question of whether any registries have a GC that would delete a blob when the manifest is still tagged.

> > Registries should only look at manifest content when they implement their GC policy.
>
> Is that "should" anywhere in the spec currently? :)

If you're looking for guarantees, I'll reiterate that there are none from OCI. None of this is in the spec yet, so you'd need to check with individual implementations or make sure your code handles the errors. There are a lot of ways even outside of GC that content can get into an inconsistent state. And even if changes were made to the spec (see the working group process) there are quite a few major registries not listed in the OCI conformance page.
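Since nothing is guaranteed, a client can at least check that everything a manifest references is still present before relying on it. A minimal sketch, assuming a parsed OCI image manifest and a caller-supplied `blob_exists` callable (hypothetical; in practice it might wrap a `HEAD /v2/<name>/blobs/<digest>` request):

```python
def missing_references(manifest, blob_exists):
    """Return the digests referenced by an image manifest that are absent.

    manifest:    parsed image manifest (dict with "config" and "layers")
    blob_exists: callable taking a digest and returning True if the
                 repository still serves that blob
    """
    descriptors = []
    if "config" in manifest:
        descriptors.append(manifest["config"])
    descriptors.extend(manifest.get("layers", []))
    return [d["digest"] for d in descriptors if not blob_exists(d["digest"])]
```

An empty result only says the content was consistent at the time of the check. The registry may still remove something afterwards, so pull-time errors need handling regardless.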
