New getting started doc #101

Merged
merged 3 commits into from Mar 13, 2024
19 changes: 10 additions & 9 deletions README.md
@@ -1,29 +1,26 @@
# Welcome to KitOps 🚀

## Unleashing the Power of Streamlined Collaboration for AI/ML Projects
## Streamlined Collaboration for AI/ML Projects

KitOps is your ultimate toolkit for transforming how you build, package, and deploy AI/ML models. Say goodbye to compatibility concerns and hello to smooth model sharing.
KitOps is your toolkit for transforming how you package, share, and deploy AI/ML models. Say goodbye to compatibility concerns and hello to smooth AI/ML collaboration.

KitOps is designed to enhance collaboration among data scientists, application developers, and SREs working on self-hosted AI/ML models. KitOps' ModelKits simplify packaging models with their dependencies, configurations, and environments. The ModelKit is portable and uses open standards for compatibility.
KitOps simplifies the handoffs between data scientists, application developers, and SREs working on self-hosted AI/ML models (including LLMs). KitOps' ModelKits create a unified package for models, their dependencies, configurations, and environments. The ModelKit is portable and uses open standards for compatibility with the tools you already use.

### What is in the box?

**ModelKit:** At the heart of KitOps is the ModelKit, an OCI-compliant packaging format that enables the seamless sharing of all necessary artifacts involved in the AI/ML model lifecycle. This includes datasets, code, configurations, and the models themselves. By standardizing the way these components are packaged, ModelKit facilitates a more streamlined and collaborative development process that is compatible with nearly any tool.

**Kitfile:** Complementing the ModelKit is the Kitfile, AI/ML project's blueprint, a YAML-based configuration file that simplifies the sharing of model, dataset, and code configurations. The Kitfile is designed with both ease of use and security in mind, ensuring that configurations can be efficiently packaged and shared without compromising on safety or governance.
**Kitfile:** Complementing the ModelKit is the Kitfile, your AI/ML project's blueprint. It's a YAML-based configuration file that simplifies the sharing of model, dataset, and code configurations. Kitfiles are designed with both ease of use and security in mind, ensuring that configurations can be efficiently packaged and shared without compromising on safety or governance.

**Kit CLI:** Your magic wand for AI/ML collaboration. The Kit CLI is a powerful tool that enables users to create, manage, run, and deploy ModelKits using Kitfiles. Whether you are packaging a new model for development or deploying an existing model into production, the Kit CLI provides the necessary commands and functionalities to streamline your workflow.

KitOps enhances the end-to-end lifecycle of AI/ML model management, making it as streamlined as managing containerized applications. It's about enhancing collaboration, streamlining processes, and unlocking a world of possibilities for AI/ML innovation.
**Kit CLI:** Your magic wand for AI/ML collaboration. The Kit CLI not only lets you create, manage, run, and deploy ModelKits, it also lets you pull only the pieces you need. Need just the serialized model for deployment? Use `unpack --model`. Want only the training datasets? Use `unpack --datasets`. Whether you are packaging a new model for development or deploying an existing model into production, the Kit CLI provides the flexibility and power to streamline your workflow.

## Quick Start with Kit

Dive into the world of KitOps with ease! Whether you're looking to streamline your AI/ML workflows or explore the power of ModelKits, getting started with Kit is straightforward.

### Running Kit with Pre-built Binaries


Get started with the Kit CLI by downloading a pre-built binary. Choose the `latest` [tagged version](https://github.com/jozu-ai/kitops/tags) for the most stable release, or explore the `next` tag for our development builds.
First, download the Kit CLI. Choose the `latest` [tagged version](https://github.com/jozu-ai/kitops/tags) for the most stable release, or explore the `next` tag for our development builds.

For installation instructions and selecting the right binary for your platform, please refer to our [Installation Guide](./docs/src/docs/cli/installation.md).

@@ -62,6 +59,10 @@ Or, for direct execution during development:
go run .
```

## Using Kit

The easiest way to get introduced to Kit is with our [getting started guide](./docs/src/docs/getting-started.md).

## Your Voice Matters

### Reporting Issues and Suggesting Features
6 changes: 6 additions & 0 deletions docs/src/docs/getting-started.md
@@ -5,6 +5,12 @@ In this guide, we'll use ModelKits and the kit CLI to easily:
* Push the ModelKit package to a public or private registry
* Grab just the things you need from the ModelKit for testing, integration, local running, or deployment

## Before we start...

Make sure you've got the Kit CLI set up on your machine. Our [installation instructions](./cli/installation.md) will help.

We recommend starting by pulling one of our [example ModelKits](https://github.com/orgs/jozu-ai/packages) to your machine and working through this guide with it. From there you can try [writing a Kitfile](./kitfile/format.md) for your own AI/ML project.

## Preparing for packaging

The first step is to make a `Kitfile` - a YAML manifest for your ModelKit. There are four main parts to a Kitfile:
66 changes: 65 additions & 1 deletion docs/src/docs/kitfile/structure.md
Contributor

Feels like this should be part of the Overview

Contributor Author

Yes, I realized that just now too. Reworked to do that and clean up a few other loose ends (including the nav).

@@ -1 +1,65 @@
# Kitfile Structure
# Kitfile Structure

The Kitfile defines the contents of your ModelKit. It is written in YAML and stored with the ModelKit. You can extract the Kitfile from any ModelKit with the Kit CLI:

```sh
kit unpack [registry/repo:tag] --config -d .
```

There are four main parts to a Kitfile:
1. ModelKit metadata in the `package` section
1. Path to the Jupyter notebook folder in the `code` section
1. Path to the serialized model in the `model` section
1. Path to the datasets in the `datasets` section (you can have multiple datasets in the same ModelKit)

Here's an example Kitfile:

```yaml
manifestVersion: v1.0.0

package:
  authors:
  - Jozu
  description: Updated model to analyze flight trait and passenger satisfaction data
  license: Apache-2.0
  name: FlightSatML

code:
- description: Jupyter notebook with model training code in Python
  path: ./notebooks

model:
  description: Flight satisfaction and trait analysis model using Scikit-learn
  framework: Scikit-learn
  license: Apache-2.0
  name: joblib Model
  path: ./models/scikit_class_model_v2.joblib
  version: 1.0.0

datasets:
- description: Flight traits and traveller satisfaction training data (tabular)
  name: training data
  path: ./data/train.csv
- description: validation data (tabular)
  name: validation data
  path: ./data/test.csv
```

The only mandatory parts of the Kitfile are:
* `manifestVersion`
* At least one of the `code`, `model`, or `datasets` sections

A ModelKit can only contain one model, but multiple datasets or code bases are allowed. Also note that you can only use relative paths (no absolute paths) in your Kitfile. Right now you can only build ModelKits from files on your local system, but we're already working towards allowing you to reference remote files: for example, building a ModelKit from a local notebook and model, plus a dataset hosted on DVC, S3, or anywhere else.

So a minimal Kitfile for distributing a pair of datasets might look like this:
```yaml
manifestVersion: v1.0.0

datasets:
- name: training data
  path: ./data/train.csv
- description: validation data (tabular)
  name: validation data
  path: ./data/test.csv
```
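
The rules above (a mandatory `manifestVersion`, at least one of `code`, `model`, or `datasets`, a single model, and relative paths only) can be sketched as a small pre-flight check. This is an illustration only and not part of the Kit CLI: the function name `check_kitfile` is hypothetical, and it assumes the Kitfile has already been parsed into a Python dictionary (for example with a YAML library of your choice).

```python
from pathlib import PurePosixPath, PureWindowsPath

# Sections named in the Kitfile rules above.
SECTIONS = ("code", "model", "datasets")


def _entries(kitfile, section):
    """Return the entries of a section as a list (model is a single mapping)."""
    value = kitfile.get(section)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def check_kitfile(kitfile: dict) -> list:
    """Hypothetical helper: list rule violations for an already-parsed Kitfile."""
    problems = []
    if "manifestVersion" not in kitfile:
        problems.append("missing manifestVersion")
    if not any(_entries(kitfile, s) for s in SECTIONS):
        problems.append("need at least one of code, model, datasets")
    if isinstance(kitfile.get("model"), list):
        problems.append("only one model is allowed")
    for section in SECTIONS:
        for entry in _entries(kitfile, section):
            if not isinstance(entry, dict):
                continue
            path = str(entry.get("path", ""))
            # Treat both POSIX-style and Windows-style absolute paths as errors.
            if PurePosixPath(path).is_absolute() or PureWindowsPath(path).is_absolute():
                problems.append(f"{section}: absolute path {path!r}")
    return problems


# The minimal datasets-only Kitfile from above, as a parsed dictionary.
minimal = {
    "manifestVersion": "v1.0.0",
    "datasets": [
        {"name": "training data", "path": "./data/train.csv"},
        {"description": "validation data (tabular)",
         "name": "validation data", "path": "./data/test.csv"},
    ],
}
print(check_kitfile(minimal))  # prints []
```

Returning a list of problems instead of raising on the first one lets you report every issue in a Kitfile in a single pass.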