New getting started doc #101
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,65 @@ | ||
# Kitfile Structure | ||
# Kitfile Structure | ||
|
||
The Kitfile defines the contents of your ModelKit. It is written in YAML and stored with the ModelKit. You can extract the Kitfile from any ModelKit with the Kit CLI: | ||
|
||
```sh | ||
kit unpack [registry/repo:tag] --config -d . | ||
``` | ||
|
||
There are four main parts to a Kitfile: | ||
1. ModelKit metadata in the `package` section | ||
1. Path to the Jupyter notebook folder in the `code` section | ||
1. Path to the serialized model in the `model` section | ||
1. Paths to the datasets in the `datasets` section (one Kitfile can list multiple datasets) | ||
|
||
Here's an example Kitfile: | ||
|
||
```yaml | ||
manifestVersion: v1.0.0 | ||
|
||
package: | ||
authors: | ||
- Jozu | ||
description: Updated model to analyze flight trait and passenger satisfaction data | ||
license: Apache-2.0 | ||
name: FlightSatML | ||
|
||
code: | ||
- description: Jupyter notebook with model training code in Python | ||
path: ./notebooks | ||
|
||
model: | ||
description: Flight satisfaction and trait analysis model using Scikit-learn | ||
framework: Scikit-learn | ||
license: Apache-2.0 | ||
name: joblib Model | ||
path: ./models/scikit_class_model_v2.joblib | ||
version: 1.0.0 | ||
|
||
datasets: | ||
- description: Flight traits and traveller satisfaction training data (tabular) | ||
name: training data | ||
path: ./data/train.csv | ||
- description: validation data (tabular) | ||
name: validation data | ||
path: ./data/test.csv | ||
``` | ||
|
||
The only mandatory parts of the Kitfile are: | ||
* `manifestVersion` | ||
* At least one of the `code`, `model`, or `datasets` sections | ||
|
||
A ModelKit can contain only one model, but multiple datasets and code bases are allowed. Paths in your Kitfile must be relative; absolute paths are not supported. Right now you can only build ModelKits from files on your local system, but we're already working towards letting you reference remote files: for example, building a ModelKit from a local notebook and model plus a dataset hosted on DVC, S3, or anywhere else. | ||
|
||
So a minimal Kitfile for a ModelKit that distributes a pair of datasets might look like this: | ||
```yaml | ||
manifestVersion: v1.0.0 | ||
|
||
datasets: | ||
- name: training data | ||
path: ./data/train.csv | ||
- description: validation data (tabular) | ||
name: validation data | ||
path: ./data/test.csv | ||
``` | ||
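
The rules described above (a mandatory `manifestVersion`, at least one of `code`, `model`, or `datasets`, a single model, and relative paths only) can be sketched as a small check. This is an illustrative sketch, not part of the Kit CLI: `check_kitfile` is a hypothetical helper, it assumes the Kitfile has already been parsed into a Python dict by a YAML parser of your choice, and it only detects POSIX-style absolute paths.

```python
from pathlib import PurePosixPath

# Sections named in the doc; at least one must be present.
SECTIONS = ("code", "model", "datasets")


def check_kitfile(kitfile: dict) -> list[str]:
    """Hypothetical helper: return a list of problems (empty means OK)."""
    problems = []

    if "manifestVersion" not in kitfile:
        problems.append("missing manifestVersion")

    if not any(kitfile.get(name) for name in SECTIONS):
        problems.append("need at least one of code, model, datasets")

    # A ModelKit holds one model, so `model` is a single mapping, not a list.
    model = kitfile.get("model")
    if model is not None and not isinstance(model, dict):
        problems.append("model must be a single mapping")

    # Collect every entry that may carry a `path` key.
    entries = []
    if isinstance(model, dict):
        entries.append(model)
    for name in ("code", "datasets"):
        entries.extend(e for e in kitfile.get(name) or [] if isinstance(e, dict))

    # Assumption: paths are strings using forward slashes.
    for entry in entries:
        path = entry.get("path")
        if path is not None and PurePosixPath(path).is_absolute():
            problems.append(f"absolute path not allowed: {path}")

    return problems


# The minimal datasets-only Kitfile from above, as a parsed dict.
minimal = {
    "manifestVersion": "v1.0.0",
    "datasets": [
        {"name": "training data", "path": "./data/train.csv"},
        {"name": "validation data", "path": "./data/test.csv"},
    ],
}
print(check_kitfile(minimal))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a Kitfile at once.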
|
Feels like this should be part of the Overview
Yes, I realized that just now too. Reworked to do that and clean up a few other loose ends (including the nav).