openai_api fails to connect to a locally deployed ollama model #4036

Open
liaoweiguo opened this issue May 16, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@liaoweiguo

Problem Description

Locally deployed ollama running qwen:32b; the deployment port is 11434.

"openai-api": {
    "model_name": "qwen:32b",
    "api_base_url": "http://localhost:11443/v1",
    "api_key": "sk-xxxx",
},
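Note that the description above says ollama serves on port 11434 while api_base_url points at 11443. As a minimal sketch to verify the endpoint directly (assuming ollama's default port 11434 and the openai Python package's v1 client; the model name is taken from the config above):

from openai import OpenAI

# ollama's OpenAI-compatible endpoint; the api_key is required by the
# client but ignored by ollama itself.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

resp = client.chat.completions.create(
    model="qwen:32b",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)

If this sketch succeeds while the app does not, the problem is in the app-side configuration rather than in ollama.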

Steps to Reproduce

  1. python startup.py -a
    Output:

==============================Langchain-Chatchat Configuration==============================
Operating system: Linux-5.15.0-105-generic-x86_64-with-glibc2.35.
Python version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
Project version: v0.2.10
langchain version: 0.0.354. fastchat version: 0.2.35

Current text splitter: ChineseRecursiveTextSplitter
Currently running LLM models: ['chatglm3-6b', 'openai-api', 'zhipu-api'] @ cuda
{'device': 'cuda',
'host': '0.0.0.0',
'infer_turbo': False,
'model_path': 'chatglm3-6b',
'model_path_exists': True,
'port': 20002}
{'api_base_url': 'http://localhost:11443/v1',
'api_key': 'sk-',
'device': 'auto',
'host': '0.0.0.0',
'infer_turbo': False,
'model_name': 'qwen:32b',
'online_api': True,
'port': 20002}
{'api_key': '',
'device': 'auto',
'host': '0.0.0.0',
'infer_turbo': False,
'online_api': True,
'port': 21001,
'provider': 'ChatGLMWorker',
'version': 'glm-4',
'worker_class': <class 'server.model_workers.zhipu.ChatGLMWorker'>}
Current Embeddings model: bge-large-zh-v1.5 @ cuda
==============================Langchain-Chatchat Configuration==============================

2024-05-16 21:01:59,787 - startup.py[line:655] - INFO: Starting services:
2024-05-16 21:01:59,787 - startup.py[line:656] - INFO: To view llm_api logs, go to /home/liao/exgit/Langchain-Chatchat/logs
/home/liao/exgit/Langchain-Chatchat/lang/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: model startup will be rewritten in Langchain-Chatchat 0.3.x to support more modes and faster startup; the related 0.2.x functionality will be deprecated
warn_deprecated(
2024-05-16 21:02:02 | INFO | model_worker | Register to controller
2024-05-16 21:02:02 | ERROR | stderr | INFO: Started server process [3474493]
2024-05-16 21:02:02 | ERROR | stderr | INFO: Waiting for application startup.
2024-05-16 21:02:02 | ERROR | stderr | INFO: Application startup complete.
2024-05-16 21:02:02 | ERROR | stderr | INFO: Uvicorn running on http://0.0.0.0:20000 (Press CTRL+C to quit)
2024-05-16 21:02:02 | INFO | model_worker | Loading the model ['chatglm3-6b'] on worker a1571fe4 ...
Loading checkpoint shards: 0%| | 0/7 [00:00<?, ?it/s]
2024-05-16 21:02:02 | ERROR | stderr | /home/liao/.local/lib/python3.10/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
2024-05-16 21:02:02 | ERROR | stderr | return self.fget.get(instance, owner)()
Loading checkpoint shards: 14%|████▊ | 1/7 [00:00<00:01, 5.89it/s]
Loading checkpoint shards: 29%|█████████▋ | 2/7 [00:00<00:00, 5.89it/s]
Loading checkpoint shards: 43%|██████████████▌ | 3/7 [00:00<00:00, 5.91it/s]
Loading checkpoint shards: 57%|███████████████████▍ | 4/7 [00:00<00:00, 6.01it/s]
Loading checkpoint shards: 71%|████████████████████████▎ | 5/7 [00:00<00:00, 5.94it/s]
Loading checkpoint shards: 86%|█████████████████████████████▏ | 6/7 [00:01<00:00, 5.90it/s]
Loading checkpoint shards: 100%|██████████████████████████████████| 7/7 [00:01<00:00, 6.54it/s]
Loading checkpoint shards: 100%|██████████████████████████████████| 7/7 [00:01<00:00, 6.18it/s]
2024-05-16 21:02:03 | ERROR | stderr |
2024-05-16 21:02:04 | INFO | model_worker | Register to controller
INFO: Started server process [3474742]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:7861 (Press CTRL+C to quit)

==============================Langchain-Chatchat Configuration==============================
Operating system: Linux-5.15.0-105-generic-x86_64-with-glibc2.35.
Python version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
Project version: v0.2.10
langchain version: 0.0.354. fastchat version: 0.2.35

Current text splitter: ChineseRecursiveTextSplitter
Currently running LLM models: ['chatglm3-6b', 'openai-api', 'zhipu-api'] @ cuda
{'device': 'cuda',
'host': '0.0.0.0',
'infer_turbo': False,
'model_path': 'chatglm3-6b',
'model_path_exists': True,
'port': 20002}
{'api_base_url': 'http://localhost:11443/v1',
'api_key': 'sk-f768e176302549058cf970ce9f297aa5',
'device': 'auto',
'host': '0.0.0.0',
'infer_turbo': False,
'model_name': 'qwen:32b',
'online_api': True,
'port': 20002}
{'api_key': '',
'device': 'auto',
'host': '0.0.0.0',
'infer_turbo': False,
'online_api': True,
'port': 21001,
'provider': 'ChatGLMWorker',
'version': 'glm-4',
'worker_class': <class 'server.model_workers.zhipu.ChatGLMWorker'>}
Current Embeddings model: bge-large-zh-v1.5 @ cuda

Server runtime information:
OpenAI API Server: http://127.0.0.1:20000/v1
Chatchat API Server: http://127.0.0.1:7861
Chatchat WEBUI Server: http://0.0.0.0:7860
==============================Langchain-Chatchat Configuration==============================

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.

You can now view your Streamlit app in your browser.

URL: http://0.0.0.0:7860

2024-05-16 21:14:58,849 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:20001/list_models "HTTP/1.1 200 OK"
INFO: 127.0.0.1:43496 - "POST /llm_model/list_running_models HTTP/1.1" 200 OK
2024-05-16 21:14:58,850 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_running_models "HTTP/1.1 200 OK"
2024-05-16 21:14:58,990 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:20001/list_models "HTTP/1.1 200 OK"
INFO: 127.0.0.1:43496 - "POST /llm_model/list_running_models HTTP/1.1" 200 OK
2024-05-16 21:14:58,991 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_running_models "HTTP/1.1 200 OK"
INFO: 127.0.0.1:43496 - "POST /llm_model/list_config_models HTTP/1.1" 200 OK
2024-05-16 21:14:58,994 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_config_models "HTTP/1.1 200 OK"

  2. Open http://0.0.0.0:7860 in Safari, select the LLM model openai-api, type "hello", and press Enter; the error occurs:

INFO: 127.0.0.1:35520 - "POST /llm_model/get_model_config HTTP/1.1" 200 OK
2024-05-16 21:19:02,110 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/get_model_config "HTTP/1.1 200 OK"
2024-05-16 21:19:02,197 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:20001/list_models "HTTP/1.1 200 OK"
INFO: 127.0.0.1:35526 - "POST /llm_model/list_running_models HTTP/1.1" 200 OK
2024-05-16 21:19:02,198 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_running_models "HTTP/1.1 200 OK"
2024-05-16 21:19:02,224 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:20001/list_models "HTTP/1.1 200 OK"
INFO: 127.0.0.1:35526 - "POST /llm_model/list_running_models HTTP/1.1" 200 OK
2024-05-16 21:19:02,225 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_running_models "HTTP/1.1 200 OK"
INFO: 127.0.0.1:35526 - "POST /llm_model/list_config_models HTTP/1.1" 200 OK
2024-05-16 21:19:02,230 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_config_models "HTTP/1.1 200 OK"
2024-05-16 21:19:08,080 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:20001/list_models "HTTP/1.1 200 OK"
INFO: 127.0.0.1:35528 - "POST /llm_model/list_running_models HTTP/1.1" 200 OK
2024-05-16 21:19:08,081 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_running_models "HTTP/1.1 200 OK"
2024-05-16 21:19:08,107 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:20001/list_models "HTTP/1.1 200 OK"
INFO: 127.0.0.1:35528 - "POST /llm_model/list_running_models HTTP/1.1" 200 OK
2024-05-16 21:19:08,108 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_running_models "HTTP/1.1 200 OK"
INFO: 127.0.0.1:35528 - "POST /llm_model/list_config_models HTTP/1.1" 200 OK
2024-05-16 21:19:08,111 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_config_models "HTTP/1.1 200 OK"
INFO: 127.0.0.1:35528 - "POST /chat/chat HTTP/1.1" 200 OK
/home/liao/exgit/Langchain-Chatchat/lang/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: The class ChatOpenAI was deprecated in LangChain 0.0.10 and will be removed in 0.2.0. An updated version of the class exists in the langchain-openai package and should be used instead. To use it run pip install -U langchain-openai and import as from langchain_openai import ChatOpenAI.
warn_deprecated(
2024-05-16 21:19:08,304 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/chat/chat "HTTP/1.1 200 OK"
2024-05-16 21:19:08,314 - _base_client.py[line:1611] - INFO: Retrying request to /chat/completions in 0.779848 seconds
2024-05-16 21:19:09,097 - _base_client.py[line:1611] - INFO: Retrying request to /chat/completions in 1.693247 seconds
2024-05-16 21:19:10,794 - utils.py[line:38] - ERROR: Connection error.
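The two "Retrying request to /chat/completions" lines followed by "Connection error" come from the openai client's built-in retry logic and typically mean the connection to api_base_url is being refused outright. A quick check of both the configured port and ollama's default one, as a sketch (httpx is assumed available, since its client already appears in the logs above):

import httpx

# Probe the port from the issue's config and ollama's default port;
# a refused connection raises, a live server returns a status code.
for url in ("http://localhost:11443/v1/models",
            "http://localhost:11434/v1/models"):
    try:
        r = httpx.get(url, timeout=5)
        print(url, "->", r.status_code)
    except httpx.HTTPError as exc:
        print(url, "->", repr(exc))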

Environment Information

Latest version, non-Docker; ollama and langchain_chatchat are deployed on Ubuntu 22.04 with 2x RTX 4090.

The openai-api on ollama itself works fine.

Browser: Safari on Mac.

Also: why do the Zhipu model and API show as running while the others do not? Where is that configured? I could not find it.

@liaoweiguo liaoweiguo added the bug Something isn't working label May 16, 2024

dosubot bot commented May 16, 2024

Hello @liaoweiguo! 👋 I'm Dosu, a bot that can help you fix bugs, answer questions, and become a contributor. I'm here to help while you wait for a human maintainer. I'll take a look at your issue now and get back to you with a full answer shortly. 🚀

@liaoweiguo
Author

(screenshot attached)

@hanwsf

hanwsf commented May 17, 2024

ONLINE_LLM_MODEL = {
    "dolphin-llama3": {  # ollama
        "api_base_url": "http://localhost:11434/v1",
        "options": "num_predict 4098",  # passed through to the ollama model as PARAMETER options
    },
    "qwen:7b-chat-v1.5-q6_K": {  # ollama, 85% w noise
        "api_base_url": "http://localhost:11434/v1",
        "options": "num_predict 4098",  # passed through to the ollama model as PARAMETER options
    },
}
