[Bug]: Streaming with async_response_gen incompatible with FastAPI #13495
Comments
The error you're encountering points to an event loop mismatch between your FastAPI application and the streaming components. To address this issue, ensure that all asynchronous components and tasks are initialized and awaited within the same event loop context.
Make sure to update your libraries to the latest versions, as fixes in newer releases might resolve your issue. Additionally, consider adding logging to track event loop usage throughout your application, which can help identify where the loop mismatch occurs. If the issue persists, reviewing the implementation of the streaming pieces involved may help locate where the mismatch originates. This approach should help resolve the asyncio error you're facing by ensuring consistent event loop usage across your FastAPI application and its asynchronous operations.
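As a rough illustration of the logging suggestion (not code from this thread; the function names are made up), recording the id of the running loop in both the request handler and the token generator makes a loop mismatch visible. In a real FastAPI app the same two log lines would go in the endpoint and in whatever fills the stream.

```python
import asyncio
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("loop-debug")

async def stream_tokens():
    # Log which loop the generator body runs on.
    logger.info("generator loop id: %s", id(asyncio.get_running_loop()))
    for token in ["Hello", ", ", "world"]:
        yield token

async def handle_request():
    # Log which loop the request handler runs on; the two ids should match.
    logger.info("handler loop id: %s", id(asyncio.get_running_loop()))
    return "".join([t async for t in stream_tokens()])

print(asyncio.run(handle_request()))
```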
@JLongley can you try updating? v0.10.37 had some updates that maybe fixed this? This script worked great for me on each request using the latest version of llama-index-core/llama-index
Thanks Logan, I've upgraded fastapi and llama-index both to the latest versions, but I'm still seeing the same errors. I notice that about 1 in every ~5 requests gets through without an error, but interestingly, the response returns all at once in Postman, not one token at a time. llama-index==0.10.37
Error output:
I was running in the browser (lol), and the streaming seemed to work fine. Let me see if I can reproduce with the above code.
Seems like the queue maybe needs to be initialized each time to be in the current async loop?
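A rough sketch of that hypothesis (illustrative only, not llama-index internals): asyncio primitives such as asyncio.Queue must be created and awaited on the same running loop, and a queue created on one loop but consumed from another produces "attached to a different loop" errors. The safe pattern is to create the queue inside the coroutine handling the current request:

```python
import asyncio

async def produce(queue: asyncio.Queue) -> None:
    # Producer fills the queue with tokens, then a sentinel to end the stream.
    for token in ["Hello", " ", "world"]:
        await queue.put(token)
    await queue.put(None)

async def consume() -> None:
    # Safe pattern: the queue is created inside the coroutine that consumes it,
    # so producer and consumer share the loop currently running this request.
    queue: asyncio.Queue = asyncio.Queue()
    asyncio.create_task(produce(queue))
    while (token := await queue.get()) is not None:
        print(token, end="")
    print()

asyncio.run(consume())
```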
@JLongley Hmm, I still can't reproduce. Maybe try with a fresh venv to be sure? I copied your script above exactly and just launched it, and then used Postman this time. Zero requests failed 🤔
Bug Description
I have a very simple FastAPI endpoint set up to test out streaming tokens back from a context chat engine. As written, the first request correctly streams the content back, but every subsequent request gives me an asyncio error:
The full stack trace is linked below.
Version
llama-index==0.10.36, fastapi==0.104.1
Steps to Reproduce
I'm running the above code in a docker container.
With that setup, I cURL
http://localhost:8000/copilot/stream_test?message=Hello
and get a streamed response. If I cURL the endpoint a second time, I get no response and the stack trace above is output by the server. Here is my implementation:
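The original snippet is not included here; below is a minimal reconstruction of the kind of endpoint described, assuming llama-index 0.10.x with a VectorStoreIndex built from a local data directory (those setup details are guesses, not the author's exact code).

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

app = FastAPI()

# Assumed setup: an index built from a local "data" directory and a context
# chat engine created once at import time, which matches the "first request
# works, later requests fail" symptom discussed in the comments above.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
chat_engine = index.as_chat_engine(chat_mode="context")

@app.get("/copilot/stream_test")
async def stream_test(message: str) -> StreamingResponse:
    # astream_chat returns a streaming chat response whose
    # async_response_gen() yields tokens as they are produced.
    response = await chat_engine.astream_chat(message)

    async def token_stream():
        async for token in response.async_response_gen():
            yield token

    return StreamingResponse(token_stream(), media_type="text/plain")
```

Per the comments above, one candidate fix is to create the chat engine (or at least everything backing the stream) inside the request coroutine so it is bound to the loop serving that request.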
Relevant Logs/Tracebacks