kitfile overview #70
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,23 @@ | ||
# Kitfiles | ||
# Kitfile: Your AI/ML Project Blueprint | ||
|
||
A Kitfile is a manifest showing what is in | ||
## What is a Kitfile? | ||
|
||
At the core of every AI/ML project managed by KitOps lies the Kitfile, a YAML-based manifest designed to streamline the encapsulation and sharing of project artifacts. From code and datasets to models and their metadata, the Kitfile serves as a comprehensive blueprint for your project, ensuring every component is meticulously organized and easily accessible. | ||
|
||
## Structured for Clarity | ||
|
||
Crafted with simplicity and efficiency in mind, the Kitfile organizes project details into distinct sections: | ||
|
||
**Project Metadata:** Offers a snapshot of your project, including its name, version, description, and authors, laying the foundation for collaboration and recognition. | ||
|
||
**Code:** Details about the source code powering your AI/ML models, complete with licensing information to uphold software best practices. | ||
|
||
**Datasets:** Descriptions and paths to datasets, highlighting preprocessing steps and licenses, to ensure reproducibility and ethical use of data. | ||
|
||
**Model Specifications:** Insights into the models themselves, including framework details, training parameters, and validation metrics, to foster understanding and further development. | ||
|
||
## Designed for Collaboration | ||
|
||
By encapsulating the essence of your AI/ML project into a singular, version-controlled document, the Kitfile not only simplifies the packaging process but also enhances collaborative efforts. Whether you're sharing projects within your team or with the global AI/ML community, the Kitfile ensures that every artifact, from datasets to models, is accurately represented and easily accessible. | ||
|
||
Embrace the Kitfile in your AI/ML projects to harness the power of structured packaging, efficient collaboration, and seamless artifact management. As the backbone of the KitOps ecosystem, the Kitfile is your first step towards simplifying AI/ML project management and achieving greater innovation. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,49 +1 @@ | ||
# ModelKit Specification v0.1 | ||
|
||
A **ModelKit** represents a comprehensive bundle of AI/ML artifacts, including models, datasets, and code, along with their associated parameters. These components are crucial at various stages of a model's lifecycle. This specification details the format and organization of these artifacts and parameters, providing guidelines for their creation, management, and use. | ||
|
||
## Terminology and Structure | ||
|
||
|
||
|
||
**Artifacts:** The building blocks of a ModelKit. Artifacts can be models, datasets, or code, each stored and addressed individually. This modular approach facilitates direct access via tools. Artifact metadata is encapsulated within the kitfile, ensuring comprehensive documentation of each component. | ||
|
||
The artifacts and their media types are | ||
* Serialized Model: `application/vnd.kitops.modelkit.model.v1.tar+gzip` | ||
* Datasets: `application/vnd.kitops.modelkit.dataset.v1.tar+gzip` | ||
* Code: `application/vnd.kitops.modelkit.code.v1.tar+gzip` | ||
|
||
**ModelKit File (Kitfile):** Acts as a record detailing the properties, relationships, and intended uses of the included artifacts. The Kitfile is central to understanding the structure and purpose of a ModelKit. It adopts the `application/vnd.kitops.modelkit.config.v1+json` media type for easy access and interpretation by tools. See the separate Kitfile specification for details. | ||
|
||
**ModelKit Manifest:** This JSON document provides essential information about the model, including creation date, authorship, and a cryptographic hash of each artifact and the Kitfile. The manifest is immutable to preserve the integrity of the ModelKitID, ensuring any modification results in the creation of a new derived ModelKit, rather than altering the existing one. | ||
|
||
### Identification and Management: | ||
|
||
**ModelKitID:** A unique identifier for each ModelKit, derived from the SHA256 hash of its manifest. For example, `sha256:a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9`. | ||
|
||
**Tag:** A tag maps a descriptive, user-given name to a single ModelKitID. Tag values are limited to the set of characters [a-zA-Z0-9_.-], except they may not start with a . or - character. Tags are limited to 128 characters. | ||
|
||
**Repository:** A collection of tags grouped under a common prefix (the name component before :). For example, in a ModelKit tagged with the name myllm:3.1.4, myllm is the Repository component of the name. A repository name is made up of slash-separated name components, optionally prefixed by a DNS hostname. The hostname must comply with standard DNS rules, but may not contain _ characters. If a hostname is present, it may optionally be followed by a port number in the format :8080. Name components may contain lowercase characters, digits, and separators. A separator is defined as a period, one or two underscores, or one or more dashes. A name component may not start or end with a separator. | ||
|
||
|
||
## ModelKit Manifest Example | ||
|
||
Example of a ModelKit manifest with a single serialized model and kitfile. | ||
|
||
```JSON | ||
{ | ||
"schemaVersion": 2, | ||
"config": { | ||
"mediaType": "application/vnd.jozu.model.config.v1+json", | ||
"digest": "sha256:d5815835051dd97d800a03f641ed8162877920e734d3d705b698912602b8c763", | ||
"size": 301 | ||
}, | ||
"layers": [ | ||
{ | ||
"mediaType": "application/vnd.jozu.model.content.v1.tar+gzip", | ||
"digest": "sha256:3f907c1a03bf20f20355fe449e18ff3f9de2e49570ffb536f1a32f20c7179808", | ||
"size": 30327160 | ||
} | ||
] | ||
} | ||
``` | ||
<!--@include: ../../../../pkg/artifact/spec.md--> | ||
Out of curiosity, is there a reason why this (and the other pkg md's) lives outside the
It is just close to where those spec are implemented. Makes it convenient for anyone who needs to modify or understand those implementations
This document makes no reference to the OCI spec -- do we want to include mention there? Otherwise, it might be confusing to see
Yes, we should refer to the OCI spec for this section.
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,47 @@ | ||||||
# ModelKit Specification v0.1 | ||||||
|
||||||
A **ModelKit** represents a comprehensive bundle of AI/ML artifacts, including models, datasets, and code, along with their associated parameters. These components are crucial at various stages of a model's lifecycle. This specification details the format and organization of these artifacts and parameters, providing guidelines for their creation, management, and use. | ||||||
|
||||||
## Terminology and Structure | ||||||
|
||||||
**Artifacts:** The building blocks of a ModelKit. Artifacts can be models, datasets, or code, each stored and addressed individually. This modular approach facilitates direct access via tools. Artifact metadata is encapsulated within the kitfile, ensuring comprehensive documentation of each component. | ||||||
|
||||||
The artifacts and their media types are | ||||||
* Serialized Model: `application/vnd.kitops.modelkit.model.v1.tar+gzip` | ||||||
* Datasets: `application/vnd.kitops.modelkit.dataset.v1.tar+gzip` | ||||||
* Code: `application/vnd.kitops.modelkit.code.v1.tar+gzip` | ||||||
Comment on lines +9 to +12
We're somewhat overloading the term
Layers are confusing too because they suggest model kit is built incrementally with layers. I agree that artifact is overused in this context but I could not find a better word to describe
We could refer to them as packages which would fit with
||||||
|
||||||
**ModelKit File (Kitfile):** Acts as a record detailing the properties, relationships, and intended uses of the included artifacts. The Kitfile is central to understanding the structure and purpose of a ModelKit. It adopts the `application/vnd.kitops.modelkit.config.v1+json` media type for easy access and interpretation by tools. See the separate Kitfile specification for details. | ||||||
Suggested change
Also, should we merge the kitfile specification and
||||||
|
||||||
**ModelKit Manifest:** This JSON document provides essential information about the model, including creation date, authorship, and a cryptographic hash of each artifact and the Kitfile. The manifest is immutable to preserve the integrity of the ModelKitID, ensuring any modification results in the creation of a new derived ModelKit, rather than altering the existing one. | ||||||
|
||||||
### Identification and Management: | ||||||
|
||||||
**ModelKitID:** A unique identifier for each ModelKit, derived from the SHA256 hash of its manifest. For example, `sha256:a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9`. | ||||||
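As a rough illustration of how such an identifier could be derived, the sketch below hashes a serialized manifest with SHA-256 and prefixes the result with `sha256:`. The function name `modelkit_id` and the sample manifest are hypothetical, and the sketch assumes the hash is taken over the exact bytes of the stored manifest; the specification text does not say how the manifest is serialized.

```python
import hashlib
import json


def modelkit_id(manifest_bytes: bytes) -> str:
    """Return 'sha256:<hex digest>' for the given serialized manifest bytes."""
    return "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()


# Hypothetical manifest, serialized deterministically for the example only.
sample_manifest = {"schemaVersion": 2, "layers": []}
raw = json.dumps(sample_manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")

kit_id = modelkit_id(raw)
print(kit_id)
```

Because the ID is a hash of the manifest bytes, changing even one byte yields a different ID, which is consistent with the rule that a modified manifest produces a new derived ModelKit.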
|
||||||
**Tag:** A tag maps a descriptive, user-given name to a single ModelKitID. Tag values are limited to the set of characters `[a-zA-Z0-9_.-]`, except they may not start with a `.` or `-` character. Tags are limited to 128 characters. | ||||||
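The tag rule can be expressed as a single regular expression. The following is a minimal sketch, not taken from the KitOps code base: `TAG_RE` and `is_valid_tag` are hypothetical names, and the sketch assumes an empty tag is invalid.

```python
import re

# First character: letter, digit, or underscore (so never '.' or '-').
# Then up to 127 more characters from [a-zA-Z0-9_.-], for 128 in total.
TAG_RE = re.compile(r"[a-zA-Z0-9_][a-zA-Z0-9_.-]{0,127}")


def is_valid_tag(tag: str) -> bool:
    """Return True if the whole string satisfies the tag rule described above."""
    return TAG_RE.fullmatch(tag) is not None


for candidate in ["3.1.4", "latest", ".hidden", "-rc1", "a" * 129]:
    print(repr(candidate[:12]), is_valid_tag(candidate))
```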
|
||||||
**Repository:** A collection of tags grouped under a common prefix (the name component before `:`). For example, in a ModelKit tagged with the name `myllm:3.1.4`, `myllm` is the Repository component of the name. A repository name is made up of slash-separated name components, optionally prefixed by a DNS hostname. The hostname must comply with standard DNS rules, but may not contain `_` characters. If a hostname is present, it may optionally be followed by a port number in the format `:8080`. Name components may contain lowercase characters, digits, and separators. A separator is defined as a period, one or two underscores, or one or more dashes. A name component may not start or end with a separator. | ||||||
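The repository naming rules can be sketched as a composed regular expression. This is an illustrative assumption about how the prose could be enforced, not the KitOps implementation: the names `COMPONENT`, `LABEL`, `HOST`, and `is_valid_repository` are hypothetical, and hostname labels are assumed to follow the usual letters-digits-hyphen form.

```python
import re

# Name component: lowercase alphanumerics joined by separators, where a
# separator is a period, one or two underscores, or one or more dashes.
COMPONENT = r"[a-z0-9]+(?:(?:\.|_{1,2}|-+)[a-z0-9]+)*"

# Hostname label (assumed letters-digits-hyphen form; no underscores).
LABEL = r"[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?"

# Optional hostname with optional port, e.g. registry.example.com:8080
HOST = rf"{LABEL}(?:\.{LABEL})*(?::[0-9]+)?"

REPOSITORY_RE = re.compile(rf"(?:{HOST}/)?{COMPONENT}(?:/{COMPONENT})*")


def is_valid_repository(name: str) -> bool:
    """Return True if name (without the ':tag' suffix) matches the rules."""
    return REPOSITORY_RE.fullmatch(name) is not None


for candidate in ["myllm", "registry.example.com:8080/org/my-llm", "my___llm"]:
    print(candidate, is_valid_repository(candidate))
```

This only validates a name. It does not decide whether the first slash-separated segment is a hostname or a name component, which a real parser would need to do.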
|
||||||
|
||||||
## ModelKit Manifest Example | ||||||
|
||||||
Example of a ModelKit manifest with a single serialized model and kitfile. | ||||||
|
||||||
```JSON | ||||||
{ | ||||||
"schemaVersion": 2, | ||||||
"config": { | ||||||
"mediaType": "application/vnd.jozu.model.config.v1+json", | ||||||
"digest": "sha256:d5815835051dd97d800a03f641ed8162877920e734d3d705b698912602b8c763", | ||||||
"size": 301 | ||||||
}, | ||||||
"layers": [ | ||||||
{ | ||||||
"mediaType": "application/vnd.jozu.model.content.v1.tar+gzip", | ||||||
"digest": "sha256:3f907c1a03bf20f20355fe449e18ff3f9de2e49570ffb536f1a32f20c7179808", | ||||||
"size": 30327160 | ||||||
} | ||||||
] | ||||||
} | ||||||
``` |
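A short sketch of how a tool might read a manifest like the one above: it parses the JSON, collects the config and layer descriptors, and checks that each carries a well-formed `sha256` digest and a non-negative size. The helper names `list_descriptors` and `check_descriptor` are hypothetical, and the checks are assumptions for illustration rather than requirements stated in the specification.

```python
import json
import re

DIGEST_RE = re.compile(r"sha256:[0-9a-f]{64}")

# The manifest from the example above, embedded verbatim.
MANIFEST_JSON = """
{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.jozu.model.config.v1+json",
    "digest": "sha256:d5815835051dd97d800a03f641ed8162877920e734d3d705b698912602b8c763",
    "size": 301
  },
  "layers": [
    {
      "mediaType": "application/vnd.jozu.model.content.v1.tar+gzip",
      "digest": "sha256:3f907c1a03bf20f20355fe449e18ff3f9de2e49570ffb536f1a32f20c7179808",
      "size": 30327160
    }
  ]
}
"""


def list_descriptors(manifest: dict) -> list:
    """Return the config descriptor followed by every layer descriptor."""
    return [manifest["config"], *manifest.get("layers", [])]


def check_descriptor(desc: dict) -> None:
    """Raise ValueError if the digest or size of a descriptor is malformed."""
    if not DIGEST_RE.fullmatch(desc.get("digest", "")):
        raise ValueError(f"malformed digest: {desc.get('digest')!r}")
    size = desc.get("size")
    if not isinstance(size, int) or size < 0:
        raise ValueError(f"malformed size: {size!r}")


manifest = json.loads(MANIFEST_JSON)
descriptors = list_descriptors(manifest)
for desc in descriptors:
    check_descriptor(desc)
    print(desc["mediaType"], desc["size"])
```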
FWIW Vitepress imports also supports importing just a range of lines, like:
that would only import from line 2 to line 10. Not sure how helpful that is but is good to know.