Clarify repository vs registry terminology #325

Open
wants to merge 1 commit into
base: main
11 changes: 6 additions & 5 deletions spec.md
@@ -53,6 +53,7 @@ These headers are OPTIONAL and clients SHOULD NOT depend on them.
Several terms are used frequently in this document and warrant basic definitions:

- **Registry**: a service that handles the required APIs defined in this specification
- **Repository**: a namespace within the registry, commonly represented here as `<name>` between the `/v2/` and API call
Member

repository is a bit more than just a namespace..

hmm..

Member

any suggestions?

Member

@mikebrow May 5, 2022

... a concept in registries for isolating API requests from one connection to another, such as for isolating public and private access to the registry in the execution of the APIs in this specification. Referred herein as a namespace in the registry and as <name> in the API examples. This specification does not limit the registry implementation of repositories to any type of namespace or namespace model, nor does this specification define how the registry performs isolation/authentication of the connections.

Contributor Author

@mikebrow if we strip out the "in the registry" text from other parts, would a more minimal definition of Repository that doesn't get into isolation and authentication make sense? I'm trying to mirror the other definitions in their brevity.

Member

@mikebrow Sep 12, 2022

@sudo-bmitch even after removing the "in the registry" text, we still have a number of uses of the term repository throughout the spec that hint at the normal uses of the term. The last sentence could be moved to the requirements section to make it clear that although the specification mentions repositories, it does not restrict or define their namespace isolation models, authentication APIs, etc.

Member

@mikebrow Sep 12, 2022

note the "definition" part was fairly thin.. "a concept in registries for isolating API requests from one connection to another" the rest is examples, the other commonly used terms, and the disclaimer...

Member

Suggested change
- **Repository**: a namespace within the registry, commonly represented here as `<name>` between the `/v2/` and API call
- **Repository**: a concept in registries for isolating API requests from one connection to another -- such as for isolating public and private access to the registry in the execution of the APIs in this specification. Referred herein as a namespace in the registry and as `<name>` in the API examples. **Note**: This specification does not limit the registry implementation of repositories to any type of namespace or namespace model, nor does this specification define how the registry performs isolation/authentication of the connections.

Contributor Author

Circling back to this, the reasoning for using the "namespace" term is because we defined it with the following text throughout the spec:

"`<name>` is the namespace of the repository"

I'm avoiding mentioning access restrictions because the spec doesn't go there anywhere else and we don't have a spec for authentication and authorization defined yet (though I would like to see that added). I think the most we can say is that references with a descriptor (index to manifest, manifest to blob) are within a repository and don't cross that boundary. The best word I can think of for that is "namespace" since we've used it so many other places in the spec already.

"a concept in registries for isolating API requests from one connection to another"

I'm not a fan of defining it based on connections. It implies that the pull of a manifest in one connection could reference blobs in another repository by having a separate connection for that blob. We've leaned pretty heavily on saying that all descriptors are references within the same repository and registries depend on that for implementing access and GC on top of this spec.

- **Client**: a tool that communicates with Registries
- **Push**: the act of uploading Blobs and Manifests to a Registry
- **Pull**: the act of downloading Blobs and Manifests from a Registry
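
To make the Repository definition under discussion concrete, here is a minimal Python sketch of pulling `<name>` out of a request path. The path shapes `/v2/<name>/manifests/<reference>` and `/v2/<name>/blobs/<digest>` are assumed from the rest of the specification rather than from this hunk, and `split_repository_path` and `RepoPath` are hypothetical names, not part of any registry implementation.

```python
import re
from typing import NamedTuple, Optional

# Assumed path shapes: /v2/<name>/manifests/<reference> and /v2/<name>/blobs/<digest>.
# Upload endpoints and other routes are deliberately not handled in this sketch.
_PATH_RE = re.compile(r"^/v2/(?P<name>.+)/(?P<kind>manifests|blobs)/(?P<ref>[^/]+)$")


class RepoPath(NamedTuple):
    name: str       # the repository, i.e. the <name> namespace
    kind: str       # "manifests" or "blobs"
    reference: str  # tag or digest


def split_repository_path(path: str) -> Optional[RepoPath]:
    """Return the repository name and API call for a path, or None if it does not match."""
    match = _PATH_RE.match(path)
    if match is None:
        return None
    return RepoPath(match["name"], match["kind"], match["ref"])
```

Because `<name>` may itself contain slashes, the sketch treats everything between `/v2/` and the final `manifests` or `blobs` segment as the repository.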
@@ -160,7 +161,7 @@ If the digest does differ, it MAY be the case that the hashing algorithms used d
See [Content Digests](https://github.com/opencontainers/image-spec/blob/v1.0.1/descriptor.md#digests) <sup>[apdx-3](#appendix)</sup> for information on how to detect the hashing algorithm in use.
Most clients MAY ignore the value, but if it is used, the client MUST verify the value against the uploaded blob data.

-If the manifest is not found in the registry, the response code MUST be `404 Not Found`.
+If the manifest is not found in the repository, the response code MUST be `404 Not Found`.
Member

@mikebrow May 5, 2022

repository is optional here...

maybe registry, or repository if restricted to a repository in the registry,

wouldn't say it each time though just this once?

Member

All of these "in the registry" statements could almost just be removed. I agree it is up to the implementation to determine the scope of a blob or manifest. Repository scope makes the most sense but that might need more explanation above or as part of the implementation guide.


##### Pulling blobs

@@ -173,9 +174,9 @@ A GET request to an existing blob URL MUST provide the expected blob, with a res
A successful response SHOULD contain the digest of the uploaded blob in the header `Docker-Content-Digest`.
If present, the value of this header MUST be a digest matching that of the response body.

-If the blob is not found in the registry, the response code MUST be `404 Not Found`.
+If the blob is not found in the repository, the response code MUST be `404 Not Found`.
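
As an illustration of the rule that a `Docker-Content-Digest` value MUST match the response body, a client-side check might look like the sketch below. It assumes the digest has the `<algorithm>:<hex>` form and only handles `sha256` and `sha512`; `digest_matches` is a hypothetical helper name.

```python
import hashlib


def digest_matches(body: bytes, header_value: str) -> bool:
    """Check an `<algorithm>:<hex>` digest string against the bytes actually received."""
    algorithm, sep, expected_hex = header_value.partition(":")
    if not sep or algorithm not in ("sha256", "sha512"):
        # Unknown or malformed digest: treat it as unverifiable rather than guess.
        return False
    actual_hex = hashlib.new(algorithm, body).hexdigest()
    return actual_hex == expected_hex
```

A client would call this with the downloaded blob bytes and the header value, and discard the blob when it returns `False`.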

-##### Checking if content exists in the registry
+##### Checking if content exists in the repository

In order to verify that a repository contains a given manifest or blob, make a `HEAD` request to a URL in the following form:

@@ -186,14 +187,14 @@ In order to verify that a repository contains a given manifest or blob, make a `
A HEAD request to an existing blob or manifest URL MUST return `200 OK`.
A successful response SHOULD contain the digest of the uploaded blob in the header `Docker-Content-Digest`.

-If the blob or manifest is not found in the registry, the response code MUST be `404 Not Found`.
+If the blob or manifest is not found in the repository, the response code MUST be `404 Not Found`.
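
The existence check described in this hunk can be sketched from the client side as follows. This is an illustration only: the URL shape `/v2/<name>/<manifests|blobs>/<reference>` is assumed from the rest of the specification, authentication is ignored, and `content_url`, `exists_from_status`, and `repository_contains` are hypothetical names.

```python
import urllib.error
import urllib.request
from typing import Optional


def content_url(registry: str, name: str, kind: str, reference: str) -> str:
    """Build the URL for a manifest or blob; `kind` is "manifests" or "blobs"."""
    if kind not in ("manifests", "blobs"):
        raise ValueError(f"unsupported kind: {kind!r}")
    return f"{registry.rstrip('/')}/v2/{name}/{kind}/{reference}"


def exists_from_status(status: int) -> Optional[bool]:
    """Map a HEAD status code to True (200), False (404), or None (anything else)."""
    if status == 200:
        return True
    if status == 404:
        return False
    return None  # e.g. 401 or 5xx: existence cannot be concluded


def repository_contains(registry: str, name: str, kind: str, reference: str) -> Optional[bool]:
    """Issue the HEAD request; authentication is out of scope for this sketch."""
    request = urllib.request.Request(content_url(registry, name, kind, reference), method="HEAD")
    try:
        with urllib.request.urlopen(request) as response:
            return exists_from_status(response.status)
    except urllib.error.HTTPError as err:
        return exists_from_status(err.code)
```

Note that the answer is scoped to the `<name>` in the URL: a `404` here says nothing about whether the same digest exists under another repository.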

#### Push

Pushing an artifact typically works in the opposite order as a pull: the blobs making up the artifact are uploaded first, and the manifest last.
A useful diagram is provided [here](https://github.com/google/go-containerregistry/tree/d7f8d06c87ed209507dd5f2d723267fe35b38a9f/pkg/v1/remote#anatomy-of-an-image-upload).

-A registry MAY reject a manifest of any type uploaded to the manifest endpoint if it references manifests or blobs that do not exist in the registry.
+A registry MAY reject a manifest of any type uploaded to the manifest endpoint if it references manifests or blobs that do not exist in the repository.
When a manifest is rejected for this reason, it must result in one or more `MANIFEST_BLOB_UNKNOWN` errors <sup>[code-1](#error-codes)</sup>.
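
To show what repository-scoped rejection could look like on the registry side, here is a rough sketch. It assumes an image manifest with `config` and `layers` descriptors and an error object with `code`, `message`, and `detail` fields; the message text and the names `referenced_digests` and `unknown_blob_errors` are illustrative, and no particular registry is claimed to work this way.

```python
from typing import Dict, Iterable, List, Set


def referenced_digests(manifest: dict) -> Iterable[str]:
    """Yield digests of descriptors in an image manifest (config and layers only)."""
    config = manifest.get("config")
    if config:
        yield config["digest"]
    for layer in manifest.get("layers", []):
        yield layer["digest"]


def unknown_blob_errors(manifest: dict, repository_blobs: Set[str]) -> List[Dict[str, str]]:
    """Return one MANIFEST_BLOB_UNKNOWN error per digest missing from this repository."""
    return [
        {
            "code": "MANIFEST_BLOB_UNKNOWN",
            "message": "manifest references a blob unknown to the repository",
            "detail": digest,
        }
        for digest in referenced_digests(manifest)
        if digest not in repository_blobs
    ]
```

Here `repository_blobs` is the set of digests stored under the target `<name>` only, which mirrors the point made in the review thread that descriptor references stay within one repository.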

##### Pushing blobs