
[WIP] Elixir code completion #2332

Open · wants to merge 8 commits into main
Conversation

@jonastemplestein commented Nov 9, 2023

The aim of this PR is to eventually offer Elixir inline code completion within Livebook.

The high-level design is like this (a sketch of the backend contract follows the list):

  • When users stop typing in a code cell, they see a ghost text suggesting what they might want to type next; hitting tab inserts that text
  • We will train a code model (using the Python ecosystem)
  • We will run inference in Livebook using either Bumblebee (in case of a beefy GPU with large memory) or otherwise a llama.cpp NIF (in which case we could use quantised models and CPU inference)
  • Users will be able to select which model to run, and Livebook will download it for them
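
To make this concrete, here is a minimal sketch of the backend contract such a design implies (module and callback names are hypothetical, not something this PR ships):

```elixir
# Hypothetical interface so GPT-4, llama.cpp and Bumblebee backends
# can all sit behind the same contract.
defmodule Livebook.Copilot.Backend do
  @doc """
  Returns the code most likely to follow `prefix`, optionally using
  the text after the cursor (`suffix`) as infilling context.
  """
  @callback complete(prefix :: String.t(), suffix :: String.t()) ::
              {:ok, String.t()} | {:error, term()}
end
```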

For more context on the project, see this random document with (slightly outdated) notes.

Status

This is a very minimal implementation of copilot-style code completion.

At the moment the only LLM supported is the GPT-4 API. Set the OPENAI_API_KEY env var to play around with it.

Inline completion should appear 500ms after you stop typing, or use Ctrl + Space to force it to appear.

TODO in Livebook

Frontend polish

  • Don't debounce completion when the keyboard shortcut is used
  • Implement stop-words logic like Tabby does (a sketch follows this list)
  • Deal better with line breaks (e.g. when the infilled code is meant to start with a newline because the cursor is at the end of a comment line)
  • Don't show completion in certain situations (e.g. empty editor, cursor at the beginning of a non-empty line, etc.)
  • Completion currently doesn't show when there is an intellisense suggestion. I'd try to make these independent and have [tab] always accept code completion and [return] always accept the intellisense suggestion (like the Cursor editor does it)
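
For the stop-words item, the trimming step could look roughly like this (the stop-word list below is made up for illustration; Tabby maintains per-language lists):

```elixir
defmodule Livebook.Copilot.StopWords do
  # Illustrative stop words; a real list would be tuned per language.
  @stop_words ["\n\n", "\ndef ", "\ndefmodule "]

  # Cuts the raw completion at the earliest stop word, so the ghost
  # text doesn't run past the construct the user is currently writing.
  def trim(completion) do
    @stop_words
    |> Enum.flat_map(fn stop ->
      case :binary.match(completion, stop) do
        {pos, _len} -> [pos]
        :nomatch -> []
      end
    end)
    |> case do
      [] -> completion
      positions -> binary_part(completion, 0, Enum.min(positions))
    end
  end
end
```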

Livebook plumbing

  • Allow the user to select a model and download it
  • Communicate what's going on to the user while the model is a) being downloaded and b) being loaded (and error states)
  • Give the LLM context from the whole notebook (not just the current cell)
  • Completion cache to avoid hitting the LLM unnecessarily (sketched below)
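
The completion cache could start out as something as simple as this (hypothetical module; a real version would also want size limits and expiry):

```elixir
defmodule Livebook.Copilot.Cache do
  @table :copilot_completion_cache

  def create_table do
    :ets.new(@table, [:named_table, :set, :public])
  end

  # Returns a cached completion for this exact context, or runs `fun`
  # (the actual LLM call) and caches its result.
  def fetch(prefix, suffix, fun) do
    key = :erlang.phash2({prefix, suffix})

    case :ets.lookup(@table, key) do
      [{^key, completion}] ->
        completion

      [] ->
        completion = fun.()
        :ets.insert(@table, {key, completion})
        completion
    end
  end
end
```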

Model inference

  • llama.cpp HTTP API for rapid prototyping against a local llama.cpp server
  • llama.cpp NIF for in-process inference
  • Bumblebee model implementation (sketched after this list)
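
For the Bumblebee route, the serving setup would look roughly like this (the model repo is a placeholder, and whether a given checkpoint loads without a custom Bumblebee loader is exactly one of the open TODOs):

```elixir
# Placeholder model; see the fine-tuning section for candidates.
repo = {:hf, "deepseek-ai/deepseek-coder-1.3b-base"}

{:ok, model_info} = Bumblebee.load_model(repo)
{:ok, tokenizer} = Bumblebee.load_tokenizer(repo)
{:ok, generation_config} = Bumblebee.load_generation_config(repo)

serving =
  Bumblebee.Text.generation(model_info, tokenizer, generation_config,
    compile: [batch_size: 1, sequence_length: 1024],
    defn_options: [compiler: EXLA]
  )

prompt = "def add(a, b), do:"
%{results: [%{text: completion}]} = Nx.Serving.run(serving, prompt)
```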

Tests!

TODO for fine-tuning a model

The hardest task is actually fine-tuning the model:

  • Acquire large amounts of Elixir code and remove PII etc. (could perhaps use The Stack plus the source code of Elixir core and a few other open source projects)
  • Turn the code into fill-in-the-middle training examples (can use GPT-4 to e.g. generate comments to be used in those examples); a toy version is sketched after this list
  • Fine-tune a bunch of different models to see which one performs best (be mindful that they each have different infilling formats; this document has a list of LLMs to evaluate)
  • Create a mechanism for evaluating fill-in-the-middle performance
  • Implement Bumblebee model loaders for the most promising models
  • Produce model files to be used by Livebook, both for Bumblebee (HF transformers format) and quantised GGUF
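
As a toy illustration of the fill-in-the-middle data step, one can cut a random span out of a source file and treat it as the "middle" (a real pipeline would cut on line or AST boundaries rather than random characters):

```elixir
# Turns one source file into one FIM training example by removing a
# random character span; assumes `source` is non-empty.
def make_fim_example(source) do
  len = String.length(source)
  start = :rand.uniform(len) - 1
  span = :rand.uniform(len - start)

  %{
    prefix: String.slice(source, 0, start),
    middle: String.slice(source, start, span),
    suffix: String.slice(source, start + span, len)
  }
end
```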

One of the most fiddly bits seems to be properly tokenising the special infilling tokens (both in Bumblebee and llama.cpp): the models often output garbage if you get this wrong. There is some good context in these llama.cpp threads [1] [2].
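
For reference, this is Code Llama's infilling prompt layout (other model families use different sentinel tokens and orderings):

```elixir
# The model generates the "middle" that belongs between prefix and
# suffix. The fiddly part: <PRE>, <SUF> and <MID> must each be encoded
# as a single special token id, not as literal text, otherwise the
# model outputs garbage.
def build_infill_prompt(prefix, suffix) do
  "<PRE> #{prefix} <SUF>#{suffix} <MID>"
end
```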

CLAassistant commented Nov 9, 2023

CLA assistant check
All committers have signed the CLA.

@jonastemplestein (Author):

Ah sorry, I meant to open this under my own fork. Shall I move it there?

@josevalim (Contributor):

Feel free to leave it here for people to play with :)

@jonastemplestein changed the title from "MVP copilot completion using GPT4 API" to "[WIP] Elixir code completion" on Nov 10, 2023
This means you can now use any model for completion that llama.cpp can run.

Just compile llama.cpp and run the server like this:

```shell
./server -m codellama-7b.Q5_K_M.gguf -c 4096
```

I've tested this with quantised CodeLlama 7B (codellama-7b.Q5_K_M.gguf) and it works well. But I have no idea whether the special `/infill` endpoint works for other models, as I don't know how llama.cpp would know about their infilling tokens.
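
For reference, the server's /infill endpoint takes the surrounding code as JSON fields, so the HTTP backend boils down to roughly this (port and n_predict are assumptions; field names as documented by llama.cpp at the time of writing):

```elixir
prefix = "defmodule Math do\n  def add(a, b) do\n    "
suffix = "\n  end\nend"

resp =
  Req.post!("http://localhost:8080/infill",
    json: %{input_prefix: prefix, input_suffix: suffix, n_predict: 64}
  )

completion = resp.body["content"]
```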
- Refactored the way copilot completion backends work
- Added Livebook.Copilot.BumblebeeBackend (including attempting to run the Serving under a new DynamicSupervisor)
- Added Livebook.Copilot.DummyBackend for testing
- Added Livebook.Copilot.LlamaCppHttpBackend for running models locally in llama.cpp's server
- Added Livebook.Copilot.OpenaiBackend for running on OpenAI
- Added Livebook.Copilot.HuggingfaceBackend to use HF inference endpoints
- Played around with adding some user feedback via flash messages
- Fixed a whole bunch of edge cases and bugs in the client-side logic
- Request completions instantly (instead of debounced) when manually requested
- Added special comments you can put in Livebook cells to override the configured copilot backend (sketched below)
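
The special-comment override from the last bullet could look something like this (the comment syntax and backend names in the mapping are illustrative guesses, not necessarily what the commit implements):

```elixir
# Picks a backend from a magic comment such as `# copilot: dummy` at
# the top of a cell; `configured_backend/0` (assumed to exist) returns
# the globally configured one.
def backend_from_cell(source) do
  case Regex.run(~r/^#\s*copilot:\s*(\S+)/, source) do
    [_, "openai"] -> Livebook.Copilot.OpenaiBackend
    [_, "llama_cpp_http"] -> Livebook.Copilot.LlamaCppHttpBackend
    [_, "dummy"] -> Livebook.Copilot.DummyBackend
    _ -> configured_backend()
  end
end
```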
@jonastemplestein (Author) commented Nov 27, 2023

Just to give a little update on this:

  • I think we will most likely want to use a fine-tune of bumblebee-1.3b (and maybe 6.7b for beefier machines)
  • I got a bit stuck last week trying to fine-tune the model but learned a lot about how the models actually work

Will hopefully have a model that is demonstrably better than bumblebee-1.3b by the end of the week.

@josevalim (Contributor):

Bumblebee or deepseek? :)
