
Pantry — An LLM Manager

Pantry is a cross between Homebrew and Docker, for LLMs. It combines an LLM repository, a local LLM runner, and a remote API, accessed via UI or CLI.

Out of the box, Pantry includes one-click downloads, custom shareable configurations (with deep links!), the ability to run multiple LLMs in parallel, a UI, a CLI, and an API.

Getting Started

Just download one of the builds, use it to download an LLM, turn it on, and go.

You can either use the UI to access the LLM, run pantry path <llm_id> to get the LLM's local path and plug it into your favorite local LLM software, or use the pantry-rs API to integrate LLMs into your own application. If you're feeling extra fancy, you can use HTTP directly, though you'll have to consult the docs.rs documentation to figure out the API.
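For example, a quick sketch of the CLI route (assuming the CLI is already on your PATH and <llm_id> is the ID the UI shows for a model you've downloaded):

# Print the model's local path, then point your favorite local LLM software at it.
pantry path <llm_id>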

Usage

(Demo videos: PantryGithub.mp4, Screen.Recording.2023-09-05.at.11.25.46.PM.mov)

Currently Pantry is compatible with all LLMs supported by the rustformers/llm project. That's an ever-expanding set of LLMs based on the ggml project.

CLI

You'll need to add the CLI to your PATH in order to use it. The UI has instructions for doing so, or you can create an alias to the install location manually. Once that's done, pantry --help will list the available commands.
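For example, a hedged sketch of the manual alias approach (the actual install location depends on your platform; the UI shows the exact path for your system):

# Placeholder path; substitute the install location the UI reports.
alias pantry="/path/to/pantry-cli"
pantry --help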

By default, the CLI uses keychain-based authentication to connect to your localhost Pantry instance. To use it, an instance of Pantry must already be running (you can close the window; it keeps running in your menu bar).

To skip the keychain prompt, generate credentials with the command pantry new_cli_user and set

PANTRY_CLI_TARGET
PANTRY_CLI_USER
PANTRY_CLI_KEY

in your environment. You can also open the UI for more instructions.
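A minimal sketch of what that looks like in your shell, assuming the values come from the output of pantry new_cli_user (the target value here is only a guess based on the default localhost port mentioned below):

# Illustrative values only; use the credentials printed by pantry new_cli_user.
export PANTRY_CLI_TARGET="localhost:9404"
export PANTRY_CLI_USER="<user id from new_cli_user>"
export PANTRY_CLI_KEY="<key from new_cli_user>"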

The CLI currently does not let you query the LLM; for that, you'll need the UI, a program using pantry-rs, or direct HTTP requests.

APIs

Pantry exposes an HTTP API over a Unix socket at /tmp/pantrylocal.sock and over localhost on port 9404. Native APIs wrapping those access points also exist (currently just one, for Rust).

Using the native APIs is extremely simple; here's an example from the Rust API:

use std::collections::HashMap;
// Import paths may differ; check docs.rs for the pantry-rs crate.
use pantry_rs::{PantryClient, UserPermissions};

// Inside an async context (the pantry-rs API is async):
let perms = UserPermissions {
    perm_superuser: false,
    perm_load_llm: true,
    perm_unload_llm: true,
    perm_download_llm: false,
    perm_session: true,
    perm_request_download: true,
    perm_request_load: true,
    perm_request_unload: true,
    perm_view_llms: true,
};

let pantry = PantryClient::register("project_name".into(), perms).await.unwrap();


// Pause here and use the UI to accept the permission request.

// You'll need a running model, so you'll need to enable it in the UI or
// uncomment this if you want your code to handle the loading.
// pantry.load_llm_flex(None, None).await.unwrap();

// This will choose the running model with the highest capability score.
let sess = pantry.create_session(HashMap::new()).await.unwrap();

// Prompt the session; `recv` receives the model's response.
let recv = sess.prompt_session("About me: ".into(), HashMap::new()).await.unwrap();
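Note that register() creates a pending permission request: the client only gets its requested permissions once you approve it in the Pantry UI, which is why the example pauses before loading a model or creating a session.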

Limitations

The system is currently great at running multiple LLMs at once, though obviously performance suffers. It unfortunately doesn't allow you to access the same LLM in parallel, because the model locks while it's running. This is only likely to be an issue if you're using the UI and the API at the same time, or if you're running multiple API programs accessing the same LLM at once.

How You Can Help

Add Models and Capability Evaluations

I tried to include a decent set of 'known-good' models in the default model repository. To be honest I've been too busy building this to spend a lot of time testing them, so if there's a ggml model you like, please pull request it. Help ranking the capabilities of existing or new models would also be appreciated.

Additional APIs

I've built a basic Rust API, mostly as a proof of concept. Improving it, or adding implementations in other languages, would go a long way.

Comprehensive Model Evaluation

Pantry should make it relatively simple to build software to more comprehensively evaluate and compare local LLMs for different use cases. This would be incredibly valuable, since it would allow the "capabilities" field to be based on more than just "vibes."

Testing/CI

I'd love to have proper regression testing and automated CI. I just haven't had the time to do it.

Backlog

  • OpenAI/Other Remote LLM Integration — The entire architecture is designed to allow this, and we're not currently taking advantage of it.
  • Non-Text Models
  • Better parallelism — currently the model locks during inference, leading to a potentially ugly queuing situation if one program is running an LLM in the background while the user is using a different program with an LLM.
  • Expand the CLI — currently limited to only basic commands.