
# AIKit ✨


AIKit is a one-stop shop to quickly get started with hosting, deploying, building, and fine-tuning large language models (LLMs).

AIKit offers two main capabilities:

- **Inference**: AIKit uses LocalAI, which supports a wide range of inference capabilities and formats. LocalAI provides a drop-in replacement REST API that is OpenAI API compatible, so you can use any OpenAI API compatible client, such as Kubectl AI, Chatbot-UI, and many more, to send requests to open-source LLMs!

- **Fine tuning**: AIKit offers an extensible fine-tuning interface. It supports Unsloth for a fast, memory-efficient, and easy fine-tuning experience.

👉 For full documentation, please see the AIKit website!

## Features

- OpenAI API compatible REST API for inference, powered by LocalAI
- Extensible fine-tuning interface with Unsloth support
- Pre-made model images for both CPU and NVIDIA CUDA
- Quick start on a local machine, with no GPU required

## Quick Start

You can get started with AIKit quickly on your local machine without a GPU!

```bash
docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:8b

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "llama-3-8b-instruct",
    "messages": [{"role": "user", "content": "explain kubernetes in a sentence"}]
  }'
```

Output should be similar to:

{"created":1713494426,"object":"chat.completion","id":"fce01ee0-7b5a-452d-8f98-b6cb406a1067","model":"llama-3-8b-instruct","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Kubernetes is an open-source container orchestration system that automates the deployment, scaling, and management of applications and services, allowing developers to focus on writing code rather than managing infrastructure."}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

That's it! 🎉 The API is OpenAI-compatible, so this is a drop-in replacement for any OpenAI API compatible client.
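
Because the endpoint follows the OpenAI API, the rest of the OpenAI-style surface works the same way. As a minimal sketch (assuming LocalAI's standard OpenAI-style routes), you can list the models a container serves, or stream tokens with the standard `stream` parameter:

```bash
# List the models served by the running container
curl http://localhost:8080/v1/models

# Stream tokens as they are generated, using the standard OpenAI
# "stream" parameter on the chat completions endpoint
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3-8b-instruct",
    "stream": true,
    "messages": [{"role": "user", "content": "explain kubernetes in a sentence"}]
  }'
```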

## Pre-made Models

AIKit comes with pre-made models that you can use out-of-the-box!

If a model you need isn't included, you can always create your own images and host them in a container registry of your choice!

### CPU

| Model | Optimization | Parameters | Command | Model Name | License |
|---|---|---|---|---|---|
| 🦙 Llama 3 | Instruct | 8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:8b` | `llama-3-8b-instruct` | Llama |
| 🦙 Llama 3 | Instruct | 70B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:70b` | `llama-3-70b-instruct` | Llama |
| 🦙 Llama 2 | Chat | 7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama2:7b` | `llama-2-7b-chat` | Llama |
| 🦙 Llama 2 | Chat | 13B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama2:13b` | `llama-2-13b-chat` | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | Apache |
| 🅿️ Phi 3 | Instruct | 3.8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | MIT |

### NVIDIA CUDA

| Model | Optimization | Parameters | Command | Model Name | License |
|---|---|---|---|---|---|
| 🦙 Llama 3 | Instruct | 8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3:8b-cuda` | `llama-3-8b-instruct` | Llama |
| 🦙 Llama 3 | Instruct | 70B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3:70b-cuda` | `llama-3-70b-instruct` | Llama |
| 🦙 Llama 2 | Chat | 7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama2:7b-cuda` | `llama-2-7b-chat` | Llama |
| 🦙 Llama 2 | Chat | 13B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama2:13b-cuda` | `llama-2-13b-chat` | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b-cuda` | `mixtral-8x7b-instruct` | Apache |
| 🅿️ Phi 3 | Instruct | 3.8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3:3.8b-cuda` | `phi-3-3.8b` | MIT |

## What's next?

👉 For more information, including how to fine-tune models or create your own images, please see the AIKit website!
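
As a hedged sketch of what creating your own image might look like: AIKit images are built with Docker BuildKit from a YAML configuration. The file name `aikitfile.yaml`, the tag `my-model`, and the exact flags below are illustrative assumptions; see the AIKit website for the authoritative schema and instructions.

```bash
# Build a model image from an AIKit YAML configuration with Docker BuildKit
# (the config file name and image tag here are illustrative)
docker buildx build . -t my-model -f aikitfile.yaml --load

# Then run it like any of the pre-made images
docker run -d --rm -p 8080:8080 my-model
```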