
Entrypoint for hosting local Kobold Lite chat interface #184

Open
wants to merge 4 commits into main

Conversation

mgoin (Member) commented Apr 12, 2024

Adds a new vllm.entrypoints.kobold.api_server interface that inherits from the OpenAI interface and uses Kobold Lite from https://github.com/LostRuins/koboldcpp/blob/concedo/klite.embd to provide a standalone LLM WebUI from a single static HTML file.

This PR is much smaller than it looks, with only about 400 lines of actual code; the remaining 15,342 lines come from the standalone HTML file.

python -m vllm.entrypoints.kobold.api_server --model nm-testing/OpenHermes-2.5-Mistral-7B-pruned50 --sparsity sparse_w16a16
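
For context, the general shape of such an entrypoint is roughly the sketch below: reuse the FastAPI app behind the OpenAI-compatible server and add a route that serves the bundled klite.embd page at the root path. This is an illustrative sketch only, not the PR's actual code; the module-level `app` import and the location of klite.embd are assumptions.

# Illustrative sketch only (not the PR's code): serve the bundled Kobold Lite
# page from the same FastAPI app that exposes the OpenAI-compatible endpoints.
from pathlib import Path

from fastapi.responses import HTMLResponse

# Assumption: the OpenAI entrypoint exposes a module-level FastAPI `app`.
from vllm.entrypoints.openai.api_server import app

# Assumption: klite.embd (the single static HTML file) ships next to this module.
KLITE_HTML = Path(__file__).parent / "klite.embd"

@app.get("/", response_class=HTMLResponse)
async def serve_kobold_lite() -> str:
    # The page is fully self-contained; once loaded in the browser it talks
    # to the completion endpoints served by this same app.
    return KLITE_HTML.read_text(encoding="utf-8")

With something along those lines, the command above serves both the OpenAI-compatible API and the chat UI from one process, and the UI would presumably be reachable at the server's root URL (http://localhost:8000 by default).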

[Screenshots: the Kobold Lite chat interface served by the new endpoint]

robertgshaw2-neuralmagic (Collaborator)

Worth pushing upstream or no?

mgoin (Member, Author) commented Apr 12, 2024


@robertgshaw2-neuralmagic I think it would be nice to have upstream! I just assumed this kind of thing wouldn't be accepted; it might require some refactoring to make it solid on top of the OpenAI endpoint.

robertgshaw2-neuralmagic (Collaborator)


I don't think it would hurt to try
