ExpressOpenAIChatProxy

This project is based on code from Pawan Osman (https://github.com/PawanOsman/ChatGPT). The original license is copied below.

It acts as a proxy for OpenAI HTTP calls (including those made by the official Python library) and routes them to a set of Azure OpenAI keys.

Once launched, the proxy is self-documenting and includes a Python notebook for Google Colab. See / or /docs/ once the server is started.

The proxy is bearer-token protected. Instructors get the current token from the /status page.

This Proxy

  • currently supports only the /chat/completions endpoint of the OpenAI API
  • is currently tested only with GPT-3.5 Turbo
  • lets you specify multiple Azure keys
  • randomly load-balances all requests across those keys
  • creates a semaphore for each key, so each key sees only a limited number of concurrent uses (see the sketch after this list)
  • caches responses in memory for the 100 most recent calls
  • applies timeout logic for when Azure never responds ... which happens
  • makes the semaphore concurrency configurable
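
As a rough illustration of the key-handling behavior above, here is a minimal conceptual sketch of random load balancing with a per-key semaphore and a hard timeout. This is not the proxy's actual code; the keys, endpoints, and the callAzure helper are all hypothetical stand-ins.

// Conceptual sketch only -- not this repo's implementation.
class Semaphore {
  constructor(max) { this.max = max; this.active = 0; this.queue = []; }
  async acquire() {
    if (this.active < this.max) { this.active++; return; }
    // otherwise wait for release() to hand us a slot
    await new Promise((resolve) => this.queue.push(resolve));
  }
  release() {
    const next = this.queue.shift();
    if (next) next(); // hand the slot directly to the next waiter
    else this.active--;
  }
}

// in the real proxy the keys and concurrency come from config.js
const keys = [
  { key: "<azure key 1>", endpoint: "https://<resource-1>.openai.azure.com" },
  { key: "<azure key 2>", endpoint: "https://<resource-2>.openai.azure.com" },
];
const semaphores = keys.map(() => new Semaphore(1)); // one concurrent use per key

// hypothetical stand-in for the real call to Azure OpenAI
async function callAzure(keyInfo, body) {
  // forward `body` to keyInfo.endpoint authenticated with keyInfo.key ...
  return { choices: [] };
}

async function proxyRequest(body, timeoutMs = 30000) {
  const i = Math.floor(Math.random() * keys.length); // random load balancing
  await semaphores[i].acquire();
  try {
    // timeout for when Azure never responds
    return await Promise.race([
      callAzure(keys[i], body),
      new Promise((_, reject) =>
        setTimeout(() => reject(new Error("Azure timeout")), timeoutMs)
      ),
    ]);
  } finally {
    semaphores[i].release();
  }
}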

TODO - not yet implemented

  • more thorough testing of streaming responses
  • some internal backoff when all semaphores are taken
  • test and make it work with the Elastic Observability assistant
  • make it work with Hugging Face Inference

First add your keys and settings

Add your Azure keys to config.js

Adjust the settings to your liking and change the admin password
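
As a rough illustration only (the actual field names in this repo's config.js may differ), a config might look like:

// Hypothetical illustration -- check config.js in the repo for the real shape.
module.exports = {
  azureKeys: [
    { key: "<azure key 1>", endpoint: "https://<resource-1>.openai.azure.com" },
    { key: "<azure key 2>", endpoint: "https://<resource-2>.openai.azure.com" },
  ],
  concurrencyPerKey: 1, // semaphore size for each key
  cacheSize: 100,       // number of recent responses kept in memory
};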

To run locally

Set up a .env file that looks like this

export BASE_URL=http://localhost:3000
export ELASTIC_APM_SERVICE_NAME=local-llm-proxy
export ELASTIC_APM_SECRET_TOKEN=<your key>
export ELASTIC_APM_SERVER_URL=<your apm server>
export ADMIN_PASSWORD=<your admin password for status page>
export SALT=<your salt>

then install the dependencies

npm install

and then

bash runLocal.sh

To run in Docker

Add your APM settings to runDocker.sh, then build the image

bash build.sh

and then

bash runDocker.sh

Deployment to Google Cloud Run

To deploy this service to Google Cloud Run (production), you will need the gcloud tooling, Docker, and sed (likely already available from your OS). The deployment script requires that the following environment variables be set in a local file (not checked in) called provision.env:

GCP_PROJECT_ID=
GCP_REGION=
GCP_LABELS_DIVISION=
GCP_LABELS_ORG=
GCP_LABELS_TEAM=

BASE_URL=

Elasticians can download the latest set of variables used for deployment from Google Secret Manager via:

gcloud secrets versions access latest --secret=llm_proxy_provision > provision.env

From there, execute ./deploy.sh to build the Docker image locally, upload it to Google's container registry, and deploy or re-deploy the Cloud Run instance.

./deploy.sh

To test

Use the API by sending queries to

http://localhost:3000/v1/chat/completions

Here's a curl example:

curl http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer fake" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

Here is a Python example. Note that it targets the legacy openai Python SDK (versions before 1.0); the v1 client uses a different interface.

import openai

proxy = "http://localhost:3000/v1"

openai.default_model = "gpt-3.5-turbo"  # not an SDK setting; just stashing the model name on the module
openai.api_key = "not-real"  # you have to submit something
openai.api_base = proxy  # point the SDK at the proxy instead of api.openai.com


try:
    prompt = "hello"  # Replace this with your actual prompt
    completion = openai.ChatCompletion.create(
        model=openai.default_model,
        messages=[
            {"role": "system", "content": "you are a pirate"},
            {"role": "user", "content": prompt}
        ],
        temperature=0.7
    )
    print(completion)

except openai.error.OpenAIError as e:
    # If the error comes from the API, print the response details
    print("An OpenAI API error occurred:")
    print("Status code:", e.http_status)
    print("Error message:", e.user_message)
    print("Request ID:", e.request_id)
    print("Error details:", e.json_body)
except Exception as e:
    # Handle other unexpected exceptions
    print("An error occurred:", str(e))