Add clarification for WASM OCI artifact use case #1137

Closed

Conversation

jsturtevant

I would like to propose a small change to the wording for OCI artifacts. This comes from the WASM use case, where we run wasm components in a runtime like containerd and using application/vnd.oci.image.config.v1+json is required. We are using the artifactType to distinguish between a standard image and a wasm OCI artifact.

Some details on how this works in containerd are in https://docs.google.com/document/d/11shgC3l6gplBjWF1VJCWvN_9do51otscAm0hBDGSSAc and there is a sample in containerd/runwasi#147

@jsturtevant jsturtevant changed the title Add clarifaction for WASM usecase Add clarifaction for WASM use case Oct 6, 2023
@jsturtevant jsturtevant changed the title Add clarifaction for WASM use case Add clarifaction for WASM OCI artifact use case Oct 6, 2023
@sudo-bmitch
Contributor

It's always felt wrong to call wasi/wasm images an "artifact". They are runnable images, with a different platform, and which use a different runtime. Is there a reason to call them an artifact instead of a runnable image?

@jsturtevant
Author

jsturtevant commented Oct 6, 2023

In Runwasi, the current approach is to package the WASM modules/components into a container image using docker, but we've been working on ways to avoid this to help with deduplication when pulling modules/components, among other efforts (more are outlined in https://docs.google.com/document/d/11shgC3l6gplBjWF1VJCWvN_9do51otscAm0hBDGSSAc).

The current approach looks like this for a WASM OCI Artifact:

regctl manifest get localhost:5000/wasi:oci
Name:                                localhost:5000/wasi:oci
MediaType:                           application/vnd.oci.image.manifest.v1+json
ArtifactType:                        application/vnd.bytecodealliance.module.v1+wasm
Digest:                              sha256:c74cb8cb19ee2632b4ae825c71780f48e3424cae6fc1e865cb34b1f03da35c58
Total Size:                          839.916kB

Config:
  Digest:                            sha256:30beeddd35f0d22d3c068f29fc6fb41bb9461807e427d3ebdf34d3b922704a4b
  MediaType:                         application/vnd.oci.image.config.v1+json
  Size:                              136B

Layers:

  Digest:                            sha256:499237ead273693b70fb4110b179226fe857efb6da0865fd8f8c98437f8c4467
  MediaType:                         application/vnd.bytecodealliance.wasm.component.layer.v0+wasm
  Size:                              146.992kB

  Digest:                            sha256:270f42cf04ef6dd5b5d223d14bf9a6c703f1f0df7dded6db897997006fa2ca26
  MediaType:                         application/vnd.bytecodealliance.wasm.component.layer.v0+wasm
  Size:                              692.031kB
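
As a rough illustration of what the output above describes, the same manifest can be written as a plain data structure and summarized. This is only a sketch: the dict mirrors the fields regctl printed, layer sizes are left out because the output shows rounded values only, and `summarize` is a hypothetical helper, not part of regctl or containerd.

```python
# Manifest fields copied from the regctl output above.
# Layer sizes are omitted: the output only shows rounded values,
# while real descriptors carry an exact byte size.
WASM_LAYER = "application/vnd.bytecodealliance.wasm.component.layer.v0+wasm"

manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "artifactType": "application/vnd.bytecodealliance.module.v1+wasm",
    "config": {
        "mediaType": "application/vnd.oci.image.config.v1+json",
        "digest": "sha256:30beeddd35f0d22d3c068f29fc6fb41bb9461807e427d3ebdf34d3b922704a4b",
        "size": 136,
    },
    "layers": [
        {
            "mediaType": WASM_LAYER,
            "digest": "sha256:499237ead273693b70fb4110b179226fe857efb6da0865fd8f8c98437f8c4467",
        },
        {
            "mediaType": WASM_LAYER,
            "digest": "sha256:270f42cf04ef6dd5b5d223d14bf9a6c703f1f0df7dded6db897997006fa2ca26",
        },
    ],
}


def summarize(m):
    """Hypothetical helper: pull out the three type fields under discussion."""
    return {
        "artifactType": m.get("artifactType"),
        "configMediaType": m["config"]["mediaType"],
        "layerMediaTypes": sorted({layer["mediaType"] for layer in m["layers"]}),
    }
```

The combination to notice is a standard image config media type next to custom layer media types and an explicit artifactType.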

@sudo-bmitch
Contributor

Given the custom layers in these images, would it make sense to use a custom config media type too? E.g. changing application/vnd.oci.image.config.v1+json to something like application/vnd.bytecodealliance.module.config.v1+json? Looking at the proposal, I'm seeing that they went with application/vnd.w3c.wasm.config.v1+json.

@jsturtevant
Author

Given the custom layers in these images, would it make sense to use a custom config media type too?

As of today, we haven't come across anything WASM-specific that isn't met by the image config media type application/vnd.oci.image.config.v1+json. In fact, this config type is useful in that it lets folks override the entrypoint and specify environment variables. It feels like this configuration is pretty close to what WASM needs since we are ultimately configuring a runtime.

One of the goals was to create an Artifact type that could be used in containerd today without major changes. We could give it a different name and keep the format the same, but I feel like that would just cause confusion.

Could there be specifics that are needed for WASM? Maybe, though we are really early in the adoption and implementation phase (WASI preview2 support hasn't landed yet). Since we've got this working without major hiccups across a wide set of runtimes (wasmer, wasmtime, spin, wasmedge, ...), it seems right to use what is working today and gather feedback.

Maybe in the future we would want to extend/change the format or change runtimes to know more about WASM artifacts, but as of now we don't have any specific requirements. I also believe any changes can be handled in a backwards compatible way.

By using the application/vnd.oci.image.config.v1+json media type we let folks use it today and gather feedback so we can improve based on the usage.

Looking at the proposal, I'm seeing that they went with application/vnd.w3c.wasm.config.v1+json.

This was an example, and the output wasn't updated after feedback. The text was updated to address the feedback, moving from application/vnd.w3c.wasm.config.v1+json to application/vnd.oci.image.manifest.v1+json, and now states:

While not strictly necessary in current proposal, by specifying the artifact type it will give runtimes ability to identify and handle scenarios related to wasm if necessary. While the media type will remain application/vnd.oci.image.manifest.v1+json to facilitate usage in containerd and configuring the runtime attributes.
In the future it may be determined that the image/runtime config may need to add fields specific for wasm as is the case for Windows/Linux.

@sudo-bmitch
Contributor

It feels like this configuration is pretty close to what WASM needs since we are ultimately configuring a runtime.

Probably the reason I'm struggling with this, and it feels like a square peg / round hole scenario, is that artifacts should not be executed by a runtime. So by using the artifactType for WASM content, we're blurring a line that runtimes may want to keep well defined. Would an annotation to select a runtime make sense?

@jsturtevant
Author

Probably the reason I'm struggling with this, and it feels like a square peg / round hole scenario, is that artifacts should not be executed by a runtime.

Why not? Is there a definition of artifact that excludes this? I was poking around and found that Singularity handles both OCI image formats and their own artifact type.

So by using the artifactType for WASM content, we're blurring a line that runtimes may want to keep well defined.

We are already executing WASM with runwasi. The wasm modules are packaged into a container image, but we had to teach the shims how to read these packages specifically. Right now all the wasm files are assumed to be in the root of the container. With the OCI artifacts, we are trying to make this more explicit, reduce duplication of the modules, and make them usable in various settings. For instance, Spin/slight can't directly run the images we publish in runwasi, and runwasi can't use the OCI Artifacts produced by them. We also don't want every runtime to need to create its own artifact type.

Would an annotation to select a runtime make sense?

Could you explain?


Maybe another alternative is to specify our own image.config like application/vnd.bytecodealliance.module.config.v1+json for version 1 and specify that the contents are expected to be parsed in the format of application/vnd.oci.image.manifest.v1+json until we identify a specific WASM configuration? I am wary of this for the reasons I've mentioned, but it would meet the definitions here without modification.

@tianon
Member

tianon commented Oct 9, 2023

For what it's worth, I'm also confused by the distinction; in the data model this repository describes today, "image" is the traditional tar based layered container image, and "artifact" is a generic wrapper for anything else (I've actually been considering setting artifactType on my traditional images to make that even more clear; everything we store is an artifact, and some artifacts have a well defined meaning specified in this repository, namely indexes and [classic] images).

@devigned

Another benefit to the application/vnd.oci.image.config.v1+json is that registries know how to parse this and display information about the image. If an opaque config like application/vnd.bytecodealliance.module.config.v1+json was used, I think registries would probably skip trying to parse the config and not show the metadata in the UI.

@sudo-bmitch
Contributor

sudo-bmitch commented Oct 10, 2023

I'm trying to look at this from the runtime side without fully understanding their implementations. Is there anything they need or want from the image spec to know when to attempt to run a container image? If downstream runtimes don't have any requirements from us, then I have no objection to relaxing the spec. (By runtimes, I'm including older versions of containerd+runc, podman, and other traditional runtimes, not just upgraded installs with the wasm support.)

Would an annotation to select a runtime make sense?

Could you explain?

It feels like artifactType is being used as a signal to wasm runtimes to know if/how the content should be executed. If we just need a text field, a well-known annotation set on the manifest can also do that. There's also the platform variant field that we use to distinguish different versions/types of a specific platform.
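
To make the three candidate signals concrete, here is a minimal sketch of how a consumer might look for a hint in each place. Nothing here comes from the spec or from containerd: the annotation key is invented for illustration, `runtime_hint` is a hypothetical helper, and the lookup order is an arbitrary choice.

```python
# Hypothetical annotation key, invented for this sketch only.
RUNTIME_ANNOTATION = "example.invalid/runtime"


def runtime_hint(manifest, platform=None):
    """Return (source, value) for the first signal found, or None.

    Checks, in an arbitrary order chosen for this sketch:
    a manifest annotation, the artifactType field, the platform variant.
    """
    annotations = manifest.get("annotations") or {}
    if RUNTIME_ANNOTATION in annotations:
        return ("annotation", annotations[RUNTIME_ANNOTATION])
    if manifest.get("artifactType"):
        return ("artifactType", manifest["artifactType"])
    if platform and platform.get("variant"):
        return ("variant", platform["variant"])
    return None
```

All three are plain text fields; which one a runtime should key off is exactly the open question in this thread.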

@tianon
Member

tianon commented Oct 10, 2023

Currently, runc does not, and probably will not, support WASM workloads. While it doesn't care about the mediaType/artifactType directly, those are the fields that would be used by something like containerd to swap from runc to a different runtime that can handle them appropriately.

@tianon
Member

tianon commented Oct 10, 2023

In other words, I'm in favor of the intent of this PR (although haven't yet reviewed the contents). Having WASM use the image config media type seems fine if it fits their needs as defined, as long as they also specify an appropriate artifact type for higher level runtimes to handle their layers and actual runtime correctly.

@tianon
Member

tianon commented Oct 10, 2023

Older containerd would probably balk appropriately at the custom layer media type (since it wouldn't know how to extract them into the snapshotters).
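
The kind of check being described could look roughly like the sketch below. It is not containerd's code: `unsupported_layers` is a hypothetical helper, and the known set lists only the standard OCI layer media types.

```python
# Standard OCI image layer media types a traditional unpacker understands.
KNOWN_LAYER_TYPES = frozenset({
    "application/vnd.oci.image.layer.v1.tar",
    "application/vnd.oci.image.layer.v1.tar+gzip",
    "application/vnd.oci.image.layer.v1.tar+zstd",
})


def unsupported_layers(manifest, known=KNOWN_LAYER_TYPES):
    """Return the layer media types an unpacker limited to `known` cannot extract."""
    return [
        layer["mediaType"]
        for layer in manifest.get("layers", [])
        if layer["mediaType"] not in known
    ]
```

A non-empty result is the point at which such a runtime would refuse to unpack the image.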


@nilslice nilslice left a comment

Minor typo fix

manifest.md (outdated, resolved)
@sudo-bmitch
Contributor

It sounds like there are a few options:

  1. This PR seems to be adding an exception to our exception description.
  2. Remove the exception description, leave it up to runtimes to figure out what they should try running and keep any guidance out of the image spec, and define images as only OCI images with a specific media type on the config and layers.
  3. Drop any definition of images, call everything artifacts, and recommend the artifactType for all manifests.
  4. Change the definition of images to be anything that could be executed by a runtime, artifacts are everything else, and specify in the spec how runtimes should differentiate between the two.

I had been going down the path of 4, but it doesn't sound like others agree. I'm not a fan of 1, there's really no value to users of the spec to see guidance full of open ended exceptions. So I think it's a choice between 2 or all the way to 3 unless there are other options I've missed.

@devigned

devigned commented Oct 10, 2023

@sudo-bmitch perhaps for 2 it could be restated that runtimes should be able to run images that have an image config and no artifactType specified. Perhaps, also add "may specify artifactType of ${X}" to further reinforce the fact it's a runnable image.

@thomastaylor312

Hopefully this is some helpful context from another voice in the wasm space (wasmCloud in particular). For me personally, I don't care too much about the content type, as we can make whatever we need work, but I will say that limiting it to application/vnd.oci.image.config.v1+json is also confusing as an indicator that it is for a runtime. First (and least important), many wasm projects don't actually run a container or container-related runtime, so having to key off of this being an "OCI image" seems just as confusing to me as other options. To be clear though, this is the least of my worries, as it really isn't hard to say "key off of the oci image type" inside of my code consuming a manifest/image.

Second, and more important, in the context of the component model a WebAssembly component (what could also be called a module) is both an executable unit that could be run inside of a runtime and an artifact, in that it can be glued together with other components. With that in mind, I honestly don't know what to recommend here, but I think it is important context to have in this discussion.

tl;dr

  1. Not everything wasm is going to be executing in a container-centric context
  2. Wasm components can be artifacts and runtime-executable things at the same time

@jsturtevant
Author

Older containerd would probably balk appropriately at the custom layer media type (since it wouldn't know how to extract them into the snapshotters).

No major changes were required to get this working in the supported containerd versions. I've merged two backport PRs to containerd for this support (containerd/containerd#9149 and containerd/containerd#9150).

@sajayantony
Member

Tagging @opencontainers/runtime-tools-maintainers for any inputs.
I'm in favor of WASM being able to use the config media type if it works out of the box in other projects like containerd and there are no objections from runtime maintainers.

@sajayantony
Member

@jsturtevant can you help DCO to pass?

@jsturtevant
Author

jsturtevant commented Oct 10, 2023

It feels like artifactType is being used as a signal to wasm runtimes to know if/how the content should be executed. If we just need a text field, a well-known annotation set on the manifest can also do that. There's also the platform variant field that we use to distinguish different versions/types of a specific platform.

I've iterated on this a few times, so I've gotten it working with/without annotations/labels and with/without artifactType. Previous iterations of the spec here had me understanding that artifactType was the best way to signal there were layer types beyond the standard image layer types, but it isn't really necessary.

As a side note, containerd lets you pass in standard layers and other media types, so you could technically include specific image layer types in the WASM artifact, get a root file system that looks the way you want it, and use the other artifact layers as the runtime (e.g. containerd shim) sees fit. I don't think this is appropriate for WASM, but I think there are some really interesting things you could potentially do with running images that also contain additional media types. I don't see a reason to restrict that type of potential.
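
A minimal sketch of the split described here, assuming a shim that treats standard OCI layers as the root file system and hands everything else to the runtime. `split_layers` is a hypothetical helper and the prefix test is a simplification, not how containerd decides.

```python
# Assumption for this sketch: any media type with this prefix is a rootfs layer.
ROOTFS_PREFIX = "application/vnd.oci.image.layer."


def split_layers(layers):
    """Partition layer descriptors into (rootfs_layers, other_layers), keeping order."""
    rootfs, other = [], []
    for layer in layers:
        if layer["mediaType"].startswith(ROOTFS_PREFIX):
            rootfs.append(layer)
        else:
            other.append(layer)
    return rootfs, other
```

With a manifest that mixes both kinds, `rootfs` would be unpacked as usual and `other` passed along for the shim to interpret.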


The end goal is to be able to publish WASM components/modules to an OCI backend and then consume them from a runtime (e.g. containerd) and outside of one (e.g. spin/wasm/cloud), as @thomastaylor312 points out.

Setting the config.mediaType allows us to use this today across both and start gathering feedback, so we could potentially propose a better WASM configuration type in the future and implement something in runtimes that handles a different configuration. Some of the details such as Env/entrypoint/volumes/workingDir/etc. might even be something the WASM runtime implementors could use.

Signed-off-by: James Sturtevant <jstur@microsoft.com>
@jsturtevant
Author

@jsturtevant can you help DCO to pass?

@sajayantony Sorry about that, should be ok now.

@neersighted

neersighted commented Oct 12, 2023

re: @tianon, but also to the manifest shown above as a whole.

I've actually been considering setting artifactType on my traditional images to make that even more clear; everything we store is an artifact, and some artifacts have a well defined meaning specified in this repository, namely indexes and [classic] images

I'd like to define an "image" as anything a runtime-spec based runtime might want to interpret, and an artifact as anything that should be opaque. From a data-model standpoint that is artifactType being present / the config.mediaType fallback.

Under the definition of an image, you can have your own layer types only a specialized runtime can interpret, and you can have custom annotations (in the descriptors) or even fields in the manifest, the config, etc.

I would like to add code to existing runtimes to ignore anything with an artifactType as it greatly simplifies content/platform discovery and prevents e.g. the need to populate a platform unknown/unknown to prevent "artifacts" (read: non-runnable content) being misinterpreted as an "image" (read: I might want a runtime to do something with this).
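
A minimal sketch of what such a filter might look like when a runtime walks an image index. This illustrates the idea only and is not code from any existing runtime; `runnable_candidates` is a hypothetical helper.

```python
def runnable_candidates(index):
    """Keep only index entries that do not declare an artifactType.

    Entries with an artifactType are treated as opaque artifacts and skipped,
    so they never take part in platform matching.
    """
    return [
        descriptor
        for descriptor in index.get("manifests", [])
        if not descriptor.get("artifactType")
    ]
```

Platform selection would then run over the remaining descriptors only.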

@tianon
Member

tianon commented Oct 12, 2023

Functionally, that doesn't sound much different from "artifact type of image manifest, implied or explicit, is a container image" except that there can only be one "it isn't set" case vs any number of artifact types that might warrant special behavior in the future

@neersighted

That's not the example above: the example above is a custom artifactType with an OCI image config mediaType. Sure, you could set the image mediaType as the artifactType, but I'd rather not open that can of worms and simply check for the presence of artifactType.

I'm working on a PR clarifying the high-level/semantic meaning of 'artifact' and 'image'; and I hope to make a stronger clarification of "artifact" being detectable in code by checking the presence of artifactType/a custom mediaType as well.

@jsturtevant
Author

Under the definition of an image, you can have your own layer types only a specialized runtime can interpret, and you can have custom annotations (in the descriptors) or even fields in the manifest, the config, etc.

Under this definition, if we remove the artifactType field from the manifest, would this manifest be acceptable? We don't really need the artifactType in this scenario; I was trying to make sense of the specification and added it to signal that this isn't a "standard" image in the way it is being used.

I would like to add code to existing runtimes to ignore anything with an artifactType as it greatly simplifies content/platform discovery and prevents e.g. the need to populate a platform unknown/unknown to prevent "artifacts" (read: non-runnable content) being misinterpreted as an "image" (read: I might want a runtime to do something with this).

The semantics seem generally OK, but it seems like artifactType is not really needed as a field if a custom config.mediaType is also required?

What about existing implementations that point to the fact that OCI artifacts with artifactTypes can be runnable? eg. https://github.com/sylabs/singularity

@jsturtevant
Author

Synced with @neersighted offline, and the conclusion is that we should remove the artifactType from the manifest shown above, which would then meet the spec's intent. There will be a couple of PRs that update the wording to clarify the distinction between an image and an artifact.

In the future we may still create a WASM-specific artifact, but this would require changes to runtimes, and we are unclear on the exact config format that would meet all the various runtime needs until WASI components stabilize.

Summary of the conversation is in CNCF wg-wasm chat: https://cloud-native.slack.com/archives/C056EDRH4PJ/p1697491155258889

@neersighted

I've opened #1141 to add what I believe to be the missing clarity the spec needs around these points.

To briefly summarize what I attempted to relay to @jsturtevant, I believe that the difference between "image" and "artifact" hinges on interpretation of the config blob. When you use the well-known OCI mediaType/format for the config (as in config.md), you have an image. If you do something else, you have an artifact.

Given the desire to use the existing config (such that existing code in containerd can understand it), I believe that the manifest the WASM folks have created is very much an image, and the only change they need to make (as shown) is to drop the artifactType.
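
The rule as summarized here can be sketched in a few lines. This is an illustration of the comment's reading, not spec text; `classify` and `drop_artifact_type` are hypothetical helpers.

```python
IMAGE_CONFIG = "application/vnd.oci.image.config.v1+json"


def classify(manifest):
    """'image' when the config uses the well-known OCI config media type, else 'artifact'."""
    config = manifest.get("config") or {}
    return "image" if config.get("mediaType") == IMAGE_CONFIG else "artifact"


def drop_artifact_type(manifest):
    """Return a copy of the manifest without the artifactType field."""
    return {key: value for key, value in manifest.items() if key != "artifactType"}
```

Under this reading the WASM manifest shown earlier already classifies as an image, and dropping artifactType only removes the conflicting signal.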

@neersighted

It's also worth noting that the above PR is not the end of the work in the image-spec to enable @jsturtevant and co; they are mixing OCI rootfs layers and non-rootfs layers in an image.

This seems like a pragmatic thing to support in the face of a real use-case, and I think we should codify something like containerd/containerd#9142 (or a variant) in the spec, such that we can define semantics for reconciling rootfs to layers (similar to the history object) and agree how to determine which layers should not be considered for rootfs/diffids validation.

@jsturtevant
Author

I think we've come to a reasonable solution with a few followups in progress. I am going to close this now as discussion has moved to #1141

8 participants