Stream = False issue #147

Open
rvsh2 opened this issue May 1, 2024 · 5 comments

rvsh2 commented May 1, 2024

Hello,
I ran this code as an example:

import asyncio  # needed for asyncio.run below

# chat is a chatlab.Chat instance; the registered functions and the
# weather_parameters schema are defined elsewhere and omitted from this report.
chat.register(get_car_price)  # register this function
chat.register(get_top_stories)  # register this function
chat.register(what_time)
chat.register(get_current_weather, weather_parameters)

async def main():
    await chat.submit("What is the weather in San Francisco?")


# Call the async function
asyncio.run(main())

The result is streamed fine:

display_id='d6d40efa-b175-4b57-a24b-9a5efd736a7b' content='' finished=True has_displayed=False
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='' finished=False has_displayed=False
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco,' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sun' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and wind' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy with' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy with a' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy with a temperature' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy with a temperature of' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy with a temperature of ' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy with a temperature of 7' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy with a temperature of 72' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy with a temperature of 72 degrees' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy with a temperature of 72 degrees F' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy with a temperature of 72 degrees Fahren' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy with a temperature of 72 degrees Fahrenheit' finished=False has_displayed=True
display_id='16450bdf-0ec4-42c2-b93f-ccf4e930c607' content='The weather in San Francisco, CA is currently sunny and windy with a temperature of 72 degrees Fahrenheit.' finished=False has_displayed=True

BUT if I run it with this change:
await chat.submit("What is the weather in San Francisco?", stream=False)

I get this error:

Traceback (most recent call last):
  File "D:\!Programs\llm-with-functionary\main.py", line 102, in <module>
    asyncio.run(main())
  File "C:\Users\krist\AppData\Local\Programs\Python\Python311\Lib\asyncio\runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "C:\Users\krist\AppData\Local\Programs\Python\Python311\Lib\asyncio\runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\krist\AppData\Local\Programs\Python\Python311\Lib\asyncio\base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "D:\!Programs\llm-with-functionary\main.py", line 98, in main
    await chat.submit("What is the weather in San Francisco?",stream=False)
  File "D:\!Programs\llm-with-functionary\venv\Lib\site-packages\chatlab\chat.py", line 356, in submit
    await self.submit(stream=stream, **kwargs)
  File "D:\!Programs\llm-with-functionary\venv\Lib\site-packages\chatlab\chat.py", line 313, in submit
    full_response = await client.chat.completions.create(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\!Programs\llm-with-functionary\venv\Lib\site-packages\openai\resources\chat\completions.py", line 1159, in create
    return await self._post(
           ^^^^^^^^^^^^^^^^^
  File "D:\!Programs\llm-with-functionary\venv\Lib\site-packages\openai\_base_client.py", line 1790, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\!Programs\llm-with-functionary\venv\Lib\site-packages\openai\_base_client.py", line 1493, in request
    return await self._request(
           ^^^^^^^^^^^^^^^^^^^^
  File "D:\!Programs\llm-with-functionary\venv\Lib\site-packages\openai\_base_client.py", line 1569, in _request
    return await self._retry_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\!Programs\llm-with-functionary\venv\Lib\site-packages\openai\_base_client.py", line 1615, in _retry_request
    return await self._request(
           ^^^^^^^^^^^^^^^^^^^^
  File "D:\!Programs\llm-with-functionary\venv\Lib\site-packages\openai\_base_client.py", line 1569, in _request
    return await self._retry_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\!Programs\llm-with-functionary\venv\Lib\site-packages\openai\_base_client.py", line 1615, in _retry_request
    return await self._request(
           ^^^^^^^^^^^^^^^^^^^^
  File "D:\!Programs\llm-with-functionary\venv\Lib\site-packages\openai\_base_client.py", line 1584, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Internal Server Error

Is this an issue, or am I doing something wrong?

rvsh2 (Author) commented May 1, 2024

I've modified the code in chat.py to show the messages generated in those two cases:

            if stream:
                print(chat_create_kwargs["messages"])  # added to inspect the outgoing messages
                streaming_response = await client.chat.completions.create(
                    **chat_create_kwargs,
                    stream=True,
                )
                self.append(*messages)

                finish_reason, function_call_request, tool_arguments = await self.__process_stream(streaming_response)
            else:
                print(chat_create_kwargs["messages"])  # added to inspect the outgoing messages
                full_response = await client.chat.completions.create(
                    **chat_create_kwargs,
                    stream=False,
                )

I got these results:

stream = False

[{'role': 'user', 'content': 'What time is it in your timezone?'}]
display_id='b1e4e516-a85f-4093-bd0c-62dbb6aa268c' content='' finished=True has_displayed=False
None
[{'role': 'user', 'content': 'What time is it in your timezone?'}, {'role': 'assistant', 'tool_calls': [{'id': 'call_cAPdStYy6dMXYTkw617eAdCw', 'function': {'name': 'what_time', 'arguments': '{}'}, 'type': 'function'}]}, {'role': 'tool', 'name': 'what_time', 'content': '22:30', 'tool_call_id': 'call_cAPdStYy6dMXYTkw617eAdCw'}]

stream = True

[{'role': 'user', 'content': 'What time is it in your timezone?'}]
None
[{'role': 'user', 'content': 'What time is it in your timezone?'}, {'content': None, 'role': 'assistant', 'function_call': None, 'tool_calls': [{'id': 'call_M5NWqRtK2ZDAlbDZqm8yewgh', 'function': {'arguments': '{}', 'name': 'what_time'}, 'type': 'function', 'index': None}], 'tool_call_id': None, 'name': None}, {'role': 'assistant', 'tool_calls': [{'id': 'call_M5NWqRtK2ZDAlbDZqm8yewgh', 'function': {'name': 'what_time', 'arguments': '{}'}, 'type': 'function'}]}, {'role': 'tool', 'name': 'what_time', 'content': '22:33', 'tool_call_id': 'call_M5NWqRtK2ZDAlbDZqm8yewgh'}]

I'm using functionary-small-v2.4 as the model, served with vLLM.

Can anyone help?

rvsh2 (Author) commented May 1, 2024

vLLM gives this output:

functionary                    | Future exception was never retrieved
functionary                    | future: <Future finished exception=TypeError("'NoneType' object is not subscriptable")>
functionary                    | Traceback (most recent call last):
functionary                    |   File "/workspace/functionary/functionary/vllm_monkey_patch/async_llm_engine.py", line 42, in _raise_exception_on_finish
functionary                    |     task.result()
functionary                    |   File "/workspace/functionary/functionary/vllm_monkey_patch/async_llm_engine.py", line 441, in run_engine_loop
functionary                    |     has_requests_in_progress = await self.engine_step()
functionary                    |   File "/workspace/functionary/functionary/vllm_monkey_patch/async_llm_engine.py", line 419, in engine_step
functionary                    |     request_outputs = await self.engine.step_async()
functionary                    |   File "/workspace/functionary/functionary/vllm_monkey_patch/async_llm_engine.py", line 265, in step_async
functionary                    |     ) = prompt_template.grammar_sample(
functionary                    |   File "/workspace/functionary/functionary/prompt_template/base_template.py", line 297, in grammar_sample
functionary                    |     options = [tool_or_func["name"] for tool_or_func in tools_or_functions]
functionary                    |   File "/workspace/functionary/functionary/prompt_template/base_template.py", line 297, in <listcomp>
functionary                    |     options = [tool_or_func["name"] for tool_or_func in tools_or_functions]
functionary                    | TypeError: 'NoneType' object is not subscriptable

rvsh2 (Author) commented May 1, 2024

If I disable grammar sampling, I get this in vLLM:

functionary                    | ERROR:    Exception in ASGI application
functionary                    | Traceback (most recent call last):
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 426, in run_asgi
functionary                    |     result = await app(  # type: ignore[func-returns-value]
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
functionary                    |     return await self.app(scope, receive, send)
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1106, in __call__
functionary                    |     await super().__call__(scope, receive, send)
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 122, in __call__
functionary                    |     await self.middleware_stack(scope, receive, send)
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 184, in __call__
functionary                    |     raise exc
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 162, in __call__
functionary                    |     await self.app(scope, receive, _send)
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/cors.py", line 83, in __call__
functionary                    |     await self.app(scope, receive, send)
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 79, in __call__
functionary                    |     raise exc
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 68, in __call__
functionary                    |     await self.app(scope, receive, sender)
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
functionary                    |     raise e
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
functionary                    |     await self.app(scope, receive, send)
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 718, in __call__
functionary                    |     await route.handle(scope, receive, send)
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 276, in handle
functionary                    |     await self.app(scope, receive, send)
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 66, in app
functionary                    |     response = await func(request)
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 274, in app
functionary                    |     raw_response = await run_endpoint_function(
functionary                    |   File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 191, in run_endpoint_function
functionary                    |     return await dependant.call(**values)
functionary                    |   File "/workspace/functionary/server_vllm.py", line 257, in create_chat_completion
functionary                    |     prompt_token_ids = prepare_messages_for_inference(
functionary                    |   File "/workspace/functionary/functionary/inference.py", line 59, in prepare_messages_for_inference
functionary                    |     dic_messages = prompt_template.pre_process_messages_before_inference(dic_messages)
functionary                    |   File "/workspace/functionary/functionary/prompt_template/prompt_template_v2.py", line 202, in pre_process_messages_before_inference
functionary                    |     new_messages = [id_2_tool_messages[cid] for cid in tool_call_ids]
functionary                    |   File "/workspace/functionary/functionary/prompt_template/prompt_template_v2.py", line 202, in <listcomp>
functionary                    |     new_messages = [id_2_tool_messages[cid] for cid in tool_call_ids]
functionary                    | KeyError: 'call_2yW4Acq9GFz6Y1t9EwL56nGi'

rvsh2 (Author) commented May 2, 2024

Hi,

I managed to solve the issue, although I'm not sure why my changes make it work as it should.
In chat.py I changed these lines in the submit function to:

        if finish_reason == "tool_calls" and tool_arguments:
            assistant_tool_calls(tool_arguments)

I found that the model always returns finish_reason = "tool_calls" if there was a tool call, even in a response that contains content.
But in that case the response always had tool_arguments = [].
That's why I added the extra "and tool_arguments" check.

Without this change the inference never stops.
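
To illustrate why the guard matters, here is a minimal sketch of the tool-call resubmit pattern it protects (simplified, with hypothetical helper names; not chatlab's actual implementation):

# Sketch only: run_tool and tool_result_message are placeholders, not chatlab code.
if finish_reason == "tool_calls" and tool_arguments:
    # The model actually requested tools: run them, record their results,
    # then re-submit so the model can produce its final answer.
    for tool_call in tool_arguments:
        result = run_tool(tool_call)
        self.append(tool_result_message(tool_call, result))
    await self.submit(stream=stream, **kwargs)
# functionary reports finish_reason == "tool_calls" even for its final text
# answer, but then tool_arguments is []. Without the "and tool_arguments"
# guard, the branch above would re-submit with nothing new and never stop.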

The second change: I removed the append call because it was appending the same tool data to the messages; the messages already contained that tool data.
I compared the messages with stream=False and stream=True and found that this duplicated entry was what crashed server_vllm.py.

As you can see in the examples below, the same tool call id is present twice when using chat with the stream=False option.

I don't know if this behaviour is specific to the functionary-v2.4 model, because I haven't tested any other model.

STREAM = FALSE

messages:
[
  {'role': 'user', 'content': 'What time is it in your timezone?'},
  {'content': None,
   'role': 'assistant', 'function_call': None,
   'tool_calls': [
     {'id': 'call_FqTEnvrccdkwasPaieYBRoMz', 'function': {'arguments': '{}', 'name': 'what_time'}, 'type': 'function', 'index': None}
   ],
   'tool_call_id': None, 'name': None},
  {'role': 'assistant',
   'tool_calls': [
     {'id': 'call_FqTEnvrccdkwasPaieYBRoMz', 'function': {'name': 'what_time', 'arguments': '{}'}, 'type': 'function'}
   ]}
]

STREAM = TRUE

messages:
[
  {'role': 'user', 'content': 'What time is it in your timezone?'},
  {'role': 'assistant',
   'tool_calls': [
     {'id': 'call_J6oDmvMgM4fYuuID5uqbHcmx', 'function': {'name': 'what_time', 'arguments': '{}'}, 'type': 'function'}
   ]}
]
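
For reference, one way to avoid the duplicated entry described above (a sketch only, not chatlab's actual code) is to append an assistant tool-call message only when none of its tool call ids are already in the history:

# Hypothetical helper: skip appending an assistant tool-call message whose
# tool call ids already appear in the message history.
def append_tool_calls_once(messages, tool_call_message):
    existing_ids = {
        call['id']
        for m in messages
        for call in (m.get('tool_calls') or [])
    }
    new_ids = {call['id'] for call in tool_call_message.get('tool_calls', [])}
    if not (new_ids & existing_ids):
        messages.append(tool_call_message)
    return messages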

Hopefully this solves the issue. Can you comment?

rgbkrk (Owner) commented May 9, 2024

Interesting, thank you. I'll have to dig in further.
