After building a Bento with OpenLLM (which generates the service and the runner), I ran the Docker image as follows:
Server:
$ docker run --rm --gpus all -p 3000:3000 -it mymodel-service:12345 start-runner-server --runner-name llm-mistral-runner
Starting RunnerServer from "/home/bentoml/bento" running on http://0.0.0.0:3000 (Press CTRL+C to quit)
Starting RunnerServer from "/home/bentoml/bento" running on http://0.0.0.0:3000 (Press CTRL+C to quit)
.......
(RayWorkerVllm pid=559) INFO 02-13 19:58:42 model_runner.py:547] Graph capturing finished in 35 secs.
Client:
Sending a request to the service fails with the error below. Could you please help me get requests through to the service?
$ curl -X 'POST' 'http://localhost:3000/generate_iterator' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{
"prompt": "Explain superconductors like Im five years old"}'
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/http/traffic.py", line 26, in __call__
await self.app(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 596, in __call__
await self.app(scope, otel_receive, otel_send)
File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/http/instruments.py", line 252, in __call__
await self.app(scope, receive, wrapped_send)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 758, in __call__
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 778, in app
await route.handle(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 299, in handle
await self.app(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 79, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
response = await func(request)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/runner_app.py", line 295, in _request_handler
arg_num = int(request.headers["args-number"])
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/datastructures.py", line 565, in __getitem__
raise KeyError(key)
KeyError: 'args-number'
During handling of the above exception, another exception occurred:
+ Exception Group Traceback (most recent call last):
| File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
| await self.app(scope, receive, _send)
| File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/http/traffic.py", line 23, in __call__
| async with anyio.create_task_group():
| File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 678, in __aexit__
| raise BaseExceptionGroup(
| ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
+-+---------------- 1 ----------------
| Traceback (most recent call last):
| File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/http/traffic.py", line 26, in __call__
| await self.app(scope, receive, send)
| File "/usr/local/lib/python3.11/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 596, in __call__
| await self.app(scope, otel_receive, otel_send)
| File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/http/instruments.py", line 252, in __call__
| await self.app(scope, receive, wrapped_send)
| File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
| await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
| File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
| raise exc
| File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
| await app(scope, receive, sender)
| File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 758, in __call__
| await self.middleware_stack(scope, receive, send)
| File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 778, in app
| await route.handle(scope, receive, send)
| File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 299, in handle
| await self.app(scope, receive, send)
| File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 79, in app
| await wrap_app_handling_exceptions(app, request)(scope, receive, send)
| File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
| raise exc
| File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
| await app(scope, receive, sender)
| File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
| response = await func(request)
| ^^^^^^^^^^^^^^^^^^^
| File "/usr/local/lib/python3.11/site-packages/bentoml/_internal/server/runner_app.py", line 295, in _request_handler
| arg_num = int(request.headers["args-number"])
| ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
| File "/usr/local/lib/python3.11/site-packages/starlette/datastructures.py", line 565, in __getitem__
| raise KeyError(key)
| KeyError: 'args-number'
+------------------------------------
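The failing frame in the traceback reads an args-number header from the incoming request, and the curl command above only sends accept and Content-Type. A minimal sketch of that lookup follows, assuming a plain dict in place of Starlette's headers object; the helper name read_arg_count is hypothetical and is not part of BentoML. It only illustrates why the request raises KeyError. It is not a fix, and whether this endpoint is meant to be called directly at all is not something the traceback settles.

```python
# Hypothetical stand-in for the lookup shown at runner_app.py line 295.
# A plain dict replaces Starlette's Headers object (assumption for illustration).


def read_arg_count(headers: dict[str, str]) -> int:
    """Mirror `int(request.headers["args-number"])` from the traceback."""
    return int(headers["args-number"])


# Headers sent by the curl command in this report (lower-cased).
curl_headers = {
    "accept": "application/json",
    "content-type": "application/json",
}

missing = None
try:
    read_arg_count(curl_headers)
except KeyError as exc:
    # Same failure as in the log: the key is simply absent.
    missing = exc.args[0]
    print(f"missing header: {missing}")

# With the header present, the lookup succeeds and yields an int.
print(read_arg_count({**curl_headers, "args-number": "1"}))
```

The header name and the int() conversion come straight from the traceback; everything else here is illustrative.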
Describe the bug
After building a Bento with OpenLLM and starting the runner server from the Docker image (Server, above), a POST request to /generate_iterator (Client, above) fails with KeyError: 'args-number'.
To reproduce
No response
Logs
No response
Environment
$ bentoml -v
bentoml, version 1.1.11
$ openllm -v
openllm, 0.4.45.dev2 (compiled: False)
Python (CPython) 3.11.7
System information (Optional)
No response