
[Bug]: Intermittent "ValueError: Failed to convert output to JSON:" with AzureOpenAI and RouterQueryEngine #13562

Open
NGeorgescu opened this issue May 17, 2024 · 3 comments
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized

Comments

@NGeorgescu

Bug Description

The query intermittently fails to produce an output and errors out instead. In one run, the result is the expected output printed to the terminal:

Gatsby and Daisy share a complex and complicated relationship. Gatsby is deeply in love with Daisy and has been for many years. He even bought a house across the bay from her just to be close to her and throws extravagant parties hoping she might attend. He is also overwhelmed by her wealth and the youth and mystery it brings. Gatsby desires Daisy to admit that she never loved her husband, Tom, and only loves him. He even dreams of them getting married in her house in Louisville, like it was five years ago. However, Daisy is unable to do this. She is married to Tom and seems to be caught up in her rich, full life. They have secret meetings, but Daisy doesn't understand Gatsby's feelings and doesn't see why he can't come to her. Gatsby feels distant from her and believes she didn't enjoy his party. He is determined to fix everything and make it just like it was before. However, their relationship ends tragically.

But the previous run I got:

File ~/.virtualenvs/math/lib/python3.12/site-packages/llama_index/core/output_parsers/selection.py:99 in parse
 raise ValueError(f"Failed to convert output to JSON: {output!r}")

ValueError: Failed to convert output to JSON: 'The provided choices do not provide any specific information about the interactions between Gatsby and Daisy. Therefore, none of the choices are relevant to the question.'

Prior to that:


  File ~/.virtualenvs/math/lib/python3.12/site-packages/llama_index/core/output_parsers/selection.py:99 in parse
    raise ValueError(f"Failed to convert output to JSON: {output!r}")

ValueError: Failed to convert output to JSON: '[\n]'
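For reference, the two failing outputs above illustrate two distinct failure modes for a strict JSON parse: the plain-English refusal is not JSON at all, while `'[\n]'` is valid JSON but an empty selection list. A minimal sketch using only the stdlib `json` module (not the llama_index parser itself, which also handles markdown-fenced variants):

```python
import json

# Failure mode 1: the LLM replied in plain English instead of JSON.
refusal = ("The provided choices do not provide any specific information "
           "about the interactions between Gatsby and Daisy.")
try:
    json.loads(refusal)
except json.JSONDecodeError as e:
    print("refusal is not JSON:", e)

# Failure mode 2: valid JSON, but an empty list -- no selection to act on.
empty = json.loads('[\n]')
print("empty selection:", empty)  # → []
```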

Prior to that:

Gatsby and Daisy have a complex and complicated relationship. Gatsby is deeply in love with Daisy and has been for many years. He bought a house across the bay from her to be near her and throws extravagant parties hoping she might attend. Gatsby desires Daisy to admit that she never loved her husband, Tom, and to leave him. He even wishes for them to return to Louisville and get married as if it were five years ago. However, Daisy is married to Tom Buchanan and has a daughter. She does have feelings for Gatsby and they have an affair, but she is unable to leave Tom or admit she never loved him. Gatsby also feels a sense of distance and disconnect from Daisy, and is aware of her wealth. Their relationship ends tragically when Daisy accidentally kills Myrtle Wilson, Tom's mistress, with Gatsby's car. Gatsby takes the blame for the accident, which leads to his murder by Myrtle's husband.

So it fails intermittently, which makes running eval sets impossible.

Version

0.10.34

Steps to Reproduce

llama-index==0.10.34, llama-index-llms-azure-openai==0.1.8, Python 3.12.3 on Arch Linux, virtualenv in ~/.virtualenvs/math/

Got the exact code from this page. The only modifications were as follows:

The code block with the definition of the llm was edited to read

azure_params = {'api_key': "[redacted]", 'azure_endpoint': "https://[redacted].openai.azure.com/", 'api_version': "2023-07-01-preview"}

from llama_index.core import Settings
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding

llm = AzureOpenAI(model="gpt-4-32k", deployment_name='[redacted]', **azure_params)
embed_model = AzureOpenAIEmbedding(model="text-embedding-ada-002", deployment_name="[redacted]", **azure_params)
Settings.llm = llm
Settings.embed_model = embed_model   


Settings.chunk_size = 1024
nodes = Settings.node_parser.get_nodes_from_documents(documents)

and the final code block was edited to change the await to an asyncio.run():

import asyncio
response = asyncio.run(query_engine.aquery(
    "Describe and summarize the interactions between Gatsby and Daisy"
))
print(response)

Relevant Logs/Tracebacks

Running in Spyder and copying from the console, I got the following error, with full traceback:



query_engine = RouterQueryEngine(
    selector=LLMMultiSelector.from_defaults(),
    query_engine_tools=[
        keyword_tool,
        vector_tool,
    ],
    summarizer=tree_summarize,
)

import asyncio
response = asyncio.run(query_engine.aquery(
    "Describe and summarize the interactions between Gatsby and Daisy"
))
print(response)
Traceback (most recent call last):

  Cell In[13], line 11
    response = asyncio.run(query_engine.aquery(

  File ~/.virtualenvs/math/lib/python3.12/site-packages/nest_asyncio.py:30 in run
    return loop.run_until_complete(task)

  File ~/.virtualenvs/math/lib/python3.12/site-packages/nest_asyncio.py:98 in run_until_complete
    return f.result()

  File /usr/lib/python3.12/asyncio/futures.py:203 in result
    raise self._exception.with_traceback(self._exception_tb)

  File /usr/lib/python3.12/asyncio/tasks.py:314 in __step_run_and_handle_result
    result = coro.send(None)

  File ~/.virtualenvs/math/lib/python3.12/site-packages/llama_index/core/instrumentation/dispatcher.py:307 in async_wrapper
    result = await func(*args, **kwargs)

  File ~/.virtualenvs/math/lib/python3.12/site-packages/llama_index/core/base/base_query_engine.py:65 in aquery
    query_result = await self._aquery(str_or_query_bundle)

  File ~/.virtualenvs/math/lib/python3.12/site-packages/llama_index/core/query_engine/router_query_engine.py:214 in _aquery
    result = await self._selector.aselect(self._metadatas, query_bundle)

  File ~/.virtualenvs/math/lib/python3.12/site-packages/llama_index/core/base/base_selector.py:94 in aselect
    return await self._aselect(choices=metadatas, query=query_bundle)

  File ~/.virtualenvs/math/lib/python3.12/site-packages/llama_index/core/selectors/llm_selectors.py:234 in _aselect
    parsed = self._prompt.output_parser.parse(prediction)

  File ~/.virtualenvs/math/lib/python3.12/site-packages/llama_index/core/output_parsers/selection.py:99 in parse
    raise ValueError(f"Failed to convert output to JSON: {output!r}")

ValueError: Failed to convert output to JSON: '[\n]'
@NGeorgescu NGeorgescu added bug Something isn't working triage Issue needs to be triaged/prioritized labels May 17, 2024

dosubot bot commented May 17, 2024

The intermittent ValueError: Failed to convert output to JSON: errors you're encountering could be due to a few key issues:

  1. Non-JSON-serializable objects in the response: The error might occur if the output includes types that cannot be directly serialized to JSON. Ensure all objects in the response are of types that JSON understands (e.g., dict, list, str, int, float, bool, None).

  2. Concurrency issues: Since your project uses asynchronous operations, there might be concurrency issues leading to inconsistent states that affect serialization. Ensure that data being serialized is not being modified concurrently by multiple tasks.

  3. Data corruption or unexpected formats: If the returned data is corrupted or not in the expected format, serialization might fail. Verify the integrity and format of your data sources and the queries that produce the output.

  4. Improper handling of special characters: Ensure that special characters, especially in dynamic content, are correctly escaped to prevent serialization errors.

  5. Fallback to YAML parsing due to JSON parsing failure: The system might attempt to parse the output as YAML if JSON parsing fails, indicating that the output might sometimes not be strictly JSON-compliant. Check for syntax issues like trailing commas that JSON does not tolerate.

To address these issues, review your data handling and serialization logic, ensure that all objects are serializable, and handle concurrency carefully to avoid race conditions. Additionally, validate the format of the output before attempting to serialize it to JSON.


@logan-markewich
Collaborator

As you can see, the LLM did not output a choice for the router query engine. (it seems to think there is no valid choice)

The options here would be either wrapping your query in a try/except, or possibly letting that response return to the user (although that might be confusing for users too).
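A minimal sketch of the try/except route, assuming the `query_engine` from the issue; the `safe_query` helper, its retry count, and the fallback message are made up for illustration:

```python
import asyncio

async def safe_query(query_engine, question, retries=3):
    """Retry the router query so an occasional selector failure
    doesn't abort a whole eval run; fall back to a placeholder."""
    last_err = None
    for _ in range(retries):
        try:
            return await query_engine.aquery(question)
        except ValueError as err:  # "Failed to convert output to JSON: ..."
            last_err = err
    return f"Selector failed after {retries} attempts: {last_err}"

# Usage with the engine from the issue:
# response = asyncio.run(safe_query(query_engine,
#     "Describe and summarize the interactions between Gatsby and Daisy"))
```

For an evaluator, the fallback string keeps the run going; a failed selection then shows up as a scored-low answer rather than a crashed run.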

@NGeorgescu
Author


Yeah, the problem is that if you want to run an evaluator, for instance, it will just error out. I guess there's probably a workaround?
