From 01844dc0247abace1b812495426bddaaa30fa284 Mon Sep 17 00:00:00 2001 From: Gorkem Ercan Date: Thu, 7 Mar 2024 09:29:25 -0500 Subject: [PATCH 1/3] kitfile overview also rearranges the modelkit spec. Updates and removes a few unused files --- docs/.vitepress/config.mts | 1 - docs/src/docs/cli/usage.md | 77 ----------------------- docs/src/docs/kitfile/benefits.md | 1 - docs/src/docs/kitfile/kf-overview.md | 24 ++++++- docs/src/docs/modelkit/ModelKit_chart.svg | 23 +++++++ docs/src/docs/modelkit/intro.md | 2 + docs/src/docs/modelkit/spec.md | 50 +-------------- pkg/artifact/spec.md | 49 +++++++++++++++ 8 files changed, 97 insertions(+), 130 deletions(-) delete mode 100644 docs/src/docs/cli/usage.md delete mode 100644 docs/src/docs/kitfile/benefits.md create mode 100644 docs/src/docs/modelkit/ModelKit_chart.svg create mode 100644 pkg/artifact/spec.md diff --git a/docs/.vitepress/config.mts b/docs/.vitepress/config.mts index 30fd87e5..0412c09f 100644 --- a/docs/.vitepress/config.mts +++ b/docs/.vitepress/config.mts @@ -67,7 +67,6 @@ export default defineConfig({ items: [ { text: 'Overview', link: '/docs/kitfile/kf-overview' }, { text: 'Format', link: '/docs/kitfile/format.md' }, - { text: 'Benefits', link: '/docs/kitfile/benefits' }, ] }, { diff --git a/docs/src/docs/cli/usage.md b/docs/src/docs/cli/usage.md deleted file mode 100644 index ef939295..00000000 --- a/docs/src/docs/cli/usage.md +++ /dev/null @@ -1,77 +0,0 @@ -# Kit CLI tool - -The `kit CLI` is a tool to easily and quickly manage models. - -## Usage - -```sh -$ ./kit [command] -``` - -You can always get help on a command by adding the `-h` flag. 
- -Available Commands: - -| Command | Description | -| ---- | --- | -| `pack` | Build a ModelKit | -| `completion` | Generate the autocompletion script for the specified shell | -| `dev` | Run the serialized model | -| `fetch` | Updating the local respository for a ModelKit from the remote | -| `help` | Help about any command | -| `list` | List ModelKits | -| `login` | Login to the remote repository | -| `logout` | Logout to the remote repository | -| `pull` | Pull one or more of the model, dataset, code, and Kitfile into a destination folder | -| `push` | Push ModelKit to respository | -| `remove` | Removed the ModelKit from the local repository | -| `tag` | Tags a ModelKit | -| `version` | Display the version information for Kit | - -## A Few Examples - -To list your available ModelKit: - -```sh -$ ./kit list -``` - -To build a ModelKit for your model and tag it with `example-tag`: - -```sh -$ ./kit pack ../examples/onnx -t localhost:5050/example-repo:example-tag" -``` - -Then you can push it to your registry: - -```sh -$ ./kit push localhost:5050/example-repo:example-tag --http -``` - -After you finish calling all your friends and telling them about Kit, they will want to fetch your ModelKit and run it. The `fetch` command is used to bring everything in the ModelKit to your local machine - the model, dataset(s), code, and the [Kitfile](../kitfile/kf-overview.md) manifest. - -```sh -$ ./kit fetch localhost:5050/test-repo:test-tag --http -``` - -However, Kit is a *modular package* so if someone only needs the model they can `pull` only that part: - -```sh -$ ./kit pull -filter model -``` - -Or just the dataset: - -```sh -$ ./kit pull -filter dataset -``` - -You can also use `pull` to filer for the `code` or the Kitfile `manifest`. When you pull any filtered part of a ModelKit you always get the Kitfile as well. 
- -The `dev` command will automatically generate a RESTful API for the model and then run the model and API locally so anyone can use the model: - -```sh -$ ./kit dev -``` - -So with a few easy commands you've been able to package up a model, share it with others, and run it locally...and you never needed to learn Dockerfile syntax or how to deal with a Helm chart or other proprietary packaging method. diff --git a/docs/src/docs/kitfile/benefits.md b/docs/src/docs/kitfile/benefits.md deleted file mode 100644 index ce74158b..00000000 --- a/docs/src/docs/kitfile/benefits.md +++ /dev/null @@ -1 +0,0 @@ -# Benefits diff --git a/docs/src/docs/kitfile/kf-overview.md b/docs/src/docs/kitfile/kf-overview.md index 5c994f72..e4985814 100644 --- a/docs/src/docs/kitfile/kf-overview.md +++ b/docs/src/docs/kitfile/kf-overview.md @@ -1,3 +1,23 @@ -# Kitfiles +# Kitfile: Your AI/ML Project Blueprint -A Kitfile is a manifest showing what is in \ No newline at end of file +## What is a Kitfile? + +At the core of every AI/ML project managed by KitOps lies the Kitfile, a YAML-based manifest designed to streamline the encapsulation and sharing of project artifacts. From code and datasets to models and their metadata, the Kitfile serves as a comprehensive blueprint for your project, ensuring every component is meticulously organized and easily accessible. + +## Structured for Clarity + +Crafted with simplicity and efficiency in mind, the Kitfile organizes project details into distinct sections: + +**Project Metadata:** Offers a snapshot of your project, including its name, version, description, and authors, laying the foundation for collaboration and recognition. + +**Code:** Details about the source code powering your AI/ML models, complete with licensing information to uphold software best practices. + +**Datasets:** Descriptions and paths to datasets, highlighting preprocessing steps and licenses, to ensure reproducibility and ethical use of data. 
+ +**Model Specifications:** Insights into the models themselves, including framework details, training parameters, and validation metrics, to foster understanding and further development. + +## Designed for Collaboration + +By encapsulating the essence of your AI/ML project into a singular, version-controlled document, the Kitfile not only simplifies the packaging process but also enhances collaborative efforts. Whether you're sharing projects within your team or with the global AI/ML community, the Kitfile ensures that every artifact, from datasets to models, is accurately represented and easily accessible. + +Embrace the Kitfile in your AI/ML projects to harness the power of structured packaging, efficient collaboration, and seamless artifact management. As the backbone of the KitOps ecosystem, the Kitfile is your first step towards simplifying AI/ML project management and achieving greater innovation. diff --git a/docs/src/docs/modelkit/ModelKit_chart.svg b/docs/src/docs/modelkit/ModelKit_chart.svg new file mode 100644 index 00000000..132c98f9 --- /dev/null +++ b/docs/src/docs/modelkit/ModelKit_chart.svg @@ -0,0 +1,23 @@ + + + + + + + + + + + + + + + + + + + + + + + diff --git a/docs/src/docs/modelkit/intro.md b/docs/src/docs/modelkit/intro.md index 859d4eb2..2fd9a8c0 100644 --- a/docs/src/docs/modelkit/intro.md +++ b/docs/src/docs/modelkit/intro.md @@ -1,5 +1,7 @@ # ModelKit Overview +![ModelKit](./ModelKit_chart.svg) + ModelKit revolutionizes the way AI/ML artifacts are shared and managed throughout the lifecycle of AI/ML projects. As an OCI-compliant packaging format, ModelKit encapsulates datasets, code, configurations, and models into a single, standardized unit. This approach not only streamlines the development process but also ensures broad compatibility and integration with a vast array of tools and platforms. 
## Key Features of ModelKit: diff --git a/docs/src/docs/modelkit/spec.md b/docs/src/docs/modelkit/spec.md index 45790a7c..a2d038a8 100644 --- a/docs/src/docs/modelkit/spec.md +++ b/docs/src/docs/modelkit/spec.md @@ -1,49 +1 @@ -# ModelKit Specification v0.1 - -A **ModelKit** represents a comprehensive bundle of AI/ML artifacts, including models, datasets, and code, along with their associated parameters. These components are crucial at various stages of a model's lifecycle. This specification details the format and organization of these artifacts and parameters, providing guidelines for their creation, management, and use. - -## Terminology and Structure - - - -**Artifacts:** The building blocks of a ModelKit. Artifacts can be models, datasets, or code, each stored and addressed individually. This modular approach facilitates direct access via tools. Artifact metadata is encapsulated within the kitfile, ensuring comprehensive documentation of each component. - -The artifacts and their media types are -* Serialized Model: `application/vnd.kitops.modelkit.model.v1.tar+gzip` -* Datasets: `application/vnd.kitops.modelkit.dataset.v1.tar+gzip` -* Code: `application/vnd.kitops.modelkit.code.v1.tar+gzip` - -**ModelKit File (Kitfile)** Acts as a record detailing the properties, relationships, and intended uses of the included artifacts. The Kitfile is central to understanding the structure and purpose of a ModelKit. It adopts the `application/vnd.kitops.modelkit.config.v1+json` media type for easy access and interpretation by tools.See the seperate kitfile specification on details - -**ModelKit Manifest:** This JSON document provides essential information about the model, including creation date, authorship, and a cryptographic hash of each artifact and the Kitfile. The manifest is immutable to preserve the integrity of the ModelKitID, ensuring any modification results in the creation of a new derived ModelKit, rather than altering the existing one. 
- -### Identification and Management: - -**ModelKitID:** A unique identifier for each ModelKit, derived from the SHA256 hash of its manifest. For example, `sha256:a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9`. - -**Tag:** A tag serves to map a descriptive, user-given name to any single modelKitID. Tag values are limited to the set of characters [a-zA-Z0-9_.-], except they may not start with a . or - character. Tags are limited to 128 characters. - -**Repository:** A collection of tags grouped under a common prefix (the name component before :). For example, in a ModelKit tagged with the name myllm:3.1.4, myllm is the Repository component of the name. A repository name is made up of slash-separated name components, optionally prefixed by a DNS hostname. The hostname must comply with standard DNS rules, but may not contain _ characters. If a hostname is present, it may optionally be followed by a port number in the format :8080. Name components may contain lowercase characters, digits, and separators. A separator is defined as a period, one or two underscores, or one or more dashes. A name component may not start or end with a separator. - - -## ModelKit Manifest Example - -Example of a ModelKit manifest with a single serialized model and kitfile. 
- -```JSON -{ - "schemaVersion": 2, - "config": { - "mediaType": "application/vnd.jozu.model.config.v1+json", - "digest": "sha256:d5815835051dd97d800a03f641ed8162877920e734d3d705b698912602b8c763", - "size": 301 - }, - "layers": [ - { - "mediaType": "application/vnd.jozu.model.content.v1.tar+gzip", - "digest": "sha256:3f907c1a03bf20f20355fe449e18ff3f9de2e49570ffb536f1a32f20c7179808", - "size": 30327160 - } - ] -} -``` + diff --git a/pkg/artifact/spec.md b/pkg/artifact/spec.md new file mode 100644 index 00000000..45790a7c --- /dev/null +++ b/pkg/artifact/spec.md @@ -0,0 +1,49 @@ +# ModelKit Specification v0.1 + +A **ModelKit** represents a comprehensive bundle of AI/ML artifacts, including models, datasets, and code, along with their associated parameters. These components are crucial at various stages of a model's lifecycle. This specification details the format and organization of these artifacts and parameters, providing guidelines for their creation, management, and use. + +## Terminology and Structure + + + +**Artifacts:** The building blocks of a ModelKit. Artifacts can be models, datasets, or code, each stored and addressed individually. This modular approach facilitates direct access via tools. Artifact metadata is encapsulated within the kitfile, ensuring comprehensive documentation of each component. + +The artifacts and their media types are +* Serialized Model: `application/vnd.kitops.modelkit.model.v1.tar+gzip` +* Datasets: `application/vnd.kitops.modelkit.dataset.v1.tar+gzip` +* Code: `application/vnd.kitops.modelkit.code.v1.tar+gzip` + +**ModelKit File (Kitfile)** Acts as a record detailing the properties, relationships, and intended uses of the included artifacts. The Kitfile is central to understanding the structure and purpose of a ModelKit. 
It adopts the `application/vnd.kitops.modelkit.config.v1+json` media type for easy access and interpretation by tools. See the separate Kitfile specification for details. + +**ModelKit Manifest:** This JSON document provides essential information about the model, including creation date, authorship, and a cryptographic hash of each artifact and the Kitfile. The manifest is immutable to preserve the integrity of the ModelKitID, ensuring any modification results in the creation of a new derived ModelKit, rather than altering the existing one. + +### Identification and Management: + +**ModelKitID:** A unique identifier for each ModelKit, derived from the SHA256 hash of its manifest. For example, `sha256:a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9`. + +**Tag:** A tag serves to map a descriptive, user-given name to any single modelKitID. Tag values are limited to the set of characters [a-zA-Z0-9_.-], except they may not start with a . or - character. Tags are limited to 128 characters. + +**Repository:** A collection of tags grouped under a common prefix (the name component before :). For example, in a ModelKit tagged with the name myllm:3.1.4, myllm is the Repository component of the name. A repository name is made up of slash-separated name components, optionally prefixed by a DNS hostname. The hostname must comply with standard DNS rules, but may not contain _ characters. If a hostname is present, it may optionally be followed by a port number in the format :8080. Name components may contain lowercase characters, digits, and separators. A separator is defined as a period, one or two underscores, or one or more dashes. A name component may not start or end with a separator. + + +## ModelKit Manifest Example + +Example of a ModelKit manifest with a single serialized model and kitfile.
+ +```JSON +{ + "schemaVersion": 2, + "config": { + "mediaType": "application/vnd.jozu.model.config.v1+json", + "digest": "sha256:d5815835051dd97d800a03f641ed8162877920e734d3d705b698912602b8c763", + "size": 301 + }, + "layers": [ + { + "mediaType": "application/vnd.jozu.model.content.v1.tar+gzip", + "digest": "sha256:3f907c1a03bf20f20355fe449e18ff3f9de2e49570ffb536f1a32f20c7179808", + "size": 30327160 + } + ] +} +``` From 0b4b082c694ac1597cd05fadda9bdc309bb8f9bf Mon Sep 17 00:00:00 2001 From: Gorkem Ercan Date: Thu, 7 Mar 2024 10:55:40 -0500 Subject: [PATCH 2/3] remove extra lines Co-authored-by: Angel Misevski --- pkg/artifact/spec.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/pkg/artifact/spec.md b/pkg/artifact/spec.md index 45790a7c..75847779 100644 --- a/pkg/artifact/spec.md +++ b/pkg/artifact/spec.md @@ -4,8 +4,6 @@ A **ModelKit** represents a comprehensive bundle of AI/ML artifacts, including m ## Terminology and Structure - - **Artifacts:** The building blocks of a ModelKit. Artifacts can be models, datasets, or code, each stored and addressed individually. This modular approach facilitates direct access via tools. Artifact metadata is encapsulated within the kitfile, ensuring comprehensive documentation of each component. The artifacts and their media types are From aca00ca4a4b7edb933822b1e4f1286b2161e7b7c Mon Sep 17 00:00:00 2001 From: Gorkem Ercan Date: Thu, 7 Mar 2024 10:56:50 -0500 Subject: [PATCH 3/3] Inline code blocks for readability Co-authored-by: Angel Misevski --- pkg/artifact/spec.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/pkg/artifact/spec.md b/pkg/artifact/spec.md index 75847779..89495f77 100644 --- a/pkg/artifact/spec.md +++ b/pkg/artifact/spec.md @@ -19,9 +19,9 @@ The artifacts and their media types are **ModelKitID:** A unique identifier for each ModelKit, derived from the SHA256 hash of its manifest. For example, `sha256:a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9`. 
-**Tag:** A tag serves to map a descriptive, user-given name to any single modelKitID. Tag values are limited to the set of characters [a-zA-Z0-9_.-], except they may not start with a . or - character. Tags are limited to 128 characters. +**Tag:** A tag serves to map a descriptive, user-given name to any single modelKitID. Tag values are limited to the set of characters `[a-zA-Z0-9_.-]`, except they may not start with a `.` or `-` character. Tags are limited to 128 characters. -**Repository:** A collection of tags grouped under a common prefix (the name component before :). For example, in a ModelKit tagged with the name myllm:3.1.4, myllm is the Repository component of the name. A repository name is made up of slash-separated name components, optionally prefixed by a DNS hostname. The hostname must comply with standard DNS rules, but may not contain _ characters. If a hostname is present, it may optionally be followed by a port number in the format :8080. Name components may contain lowercase characters, digits, and separators. A separator is defined as a period, one or two underscores, or one or more dashes. A name component may not start or end with a separator. +**Repository:** A collection of tags grouped under a common prefix (the name component before `:`). For example, in a ModelKit tagged with the name `myllm:3.1.4`, `myllm` is the Repository component of the name. A repository name is made up of slash-separated name components, optionally prefixed by a DNS hostname. The hostname must comply with standard DNS rules, but may not contain `_` characters. If a hostname is present, it may optionally be followed by a port number in the format `:8080`. Name components may contain lowercase characters, digits, and separators. A separator is defined as a period, one or two underscores, or one or more dashes. A name component may not start or end with a separator. ## ModelKit Manifest Example
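
Review note: the **Tag** and **Repository** rules in the hunk above could be checked with a pair of regular expressions. The following is only a minimal sketch under stated assumptions. The helper names `is_valid_tag` and `is_valid_repository_path` are hypothetical and not part of the Kit CLI; "lowercase characters" is read as `a-z`; and the optional hostname and port prefix is deliberately left unhandled.

```python
import re

# Hypothetical helpers for illustration only; not part of the Kit CLI or its packages.

# Tag: characters [a-zA-Z0-9_.-], must not start with '.' or '-', at most 128 characters.
TAG_RE = re.compile(r"[a-zA-Z0-9_][a-zA-Z0-9_.-]{0,127}")

# Name component: runs of lowercase letters/digits joined by separators.
# A separator is a period, one or two underscores, or one or more dashes,
# so a component can never start or end with a separator.
# Assumption: "lowercase characters" means a-z.
NAME_COMPONENT_RE = re.compile(r"[a-z0-9]+(?:(?:\.|_{1,2}|-+)[a-z0-9]+)*")


def is_valid_tag(tag: str) -> bool:
    """Return True if the whole string satisfies the tag rules."""
    return TAG_RE.fullmatch(tag) is not None


def is_valid_repository_path(path: str) -> bool:
    """Check each slash-separated name component (hostname and port not handled)."""
    return all(NAME_COMPONENT_RE.fullmatch(part) for part in path.split("/"))
```

For the spec's own example `myllm:3.1.4`, `is_valid_repository_path("myllm")` and `is_valid_tag("3.1.4")` both hold, while a tag such as `.hidden` or a component such as `my___llm` is rejected.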