Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Use local LLM deployment as Judge #694

Open
pascal-pfeiffer opened this issue May 3, 2024 · 1 comment
Open

[FEATURE] Use local LLM deployment as Judge #694

pascal-pfeiffer opened this issue May 3, 2024 · 1 comment
Assignees
Labels
type/feature Feature request

Comments

@pascal-pfeiffer
Copy link
Collaborator

馃殌 Feature

Additional to the hardcoded models, add support to use local models as judges for evaluation. Can be simplified to require the OpenAI API.
Should be basically an endpoint selection, probably the Azure hosted pipeline can be extended to cover this. If already working, add documentation on how this can be done.

Motivation

Local development and evals

@pascal-pfeiffer pascal-pfeiffer added the type/feature Feature request label May 3, 2024
@pascal-pfeiffer
Copy link
Collaborator Author

One way this is already supported in the current version:

  1. Have an endpoint running that supports the OpenAI API format, specifically chat.completions.

  2. Start LLM Studio with environment variable to point to that endpoint: OPENAI_API_BASE="http://111.111.111.111:8000/v1"

  3. Validate correct usage in "Settings page". Note that "Use Azure" must be off, and the environment variable that was set above should be visible below. Changing it here has no effect! This is only for testing the correct setting of the environment variable.
    image

  4. Run an experiment with GPT metric and use the correct model name at your endpoint:
    image

  5. Calls to the LLM judge are now directed to your own LLM endpoint
    image

@pascal-pfeiffer pascal-pfeiffer self-assigned this May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/feature Feature request
Projects
None yet
Development

No branches or pull requests

1 participant