[Bug]: Starter Tutorial (Local Models) - model response incorrect #13542

Open
gmatteuc opened this issue May 16, 2024 · 6 comments
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized

Comments

@gmatteuc

gmatteuc commented May 16, 2024

Bug Description

I followed the installation and setup steps described in the documentation page. Everything seems to be set up correctly, but when I run the starter tutorial code the model doesn't seem to process the document ("paul_graham_essay.txt") or the request correctly: instead of responding "The author wrote short stories and tried to program on an IBM 1401." it responds "Based on the context provided in the essay, the author did not directly mention what they did growing up. However, we can infer some information about their background from the text. [...]". From the logging, it seems that the top 2 nodes retrieved are actually not the most relevant parts of the text for answering the question. What could the issue be?
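For reference, the retrieval step can be inspected directly with something like the sketch below (it assumes the same index built in the starter code further down; similarity_top_k=5 is an arbitrary choice, not a recommendation):

retriever = index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve("What did the author do growing up?")
for node in nodes:
    # Each result carries a similarity score and the chunk text.
    print(f"{node.score:.4f}  {node.text[:80]!r}")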

Version

0.10.37

Steps to Reproduce

Installing and setting up as described here: https://docs.llamaindex.ai/en/stable/getting_started/installation/ and running the following code as "starter.py":

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama
import logging
import sys

# Verbose logging to inspect chunking, embedding calls, and retrieval
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

documents = SimpleDirectoryReader("data").load_data()

# Local embedding model and LLM served by Ollama
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")
Settings.llm = Ollama(model="llama2", request_timeout=360.0)

index = VectorStoreIndex.from_documents(
    documents,
)

query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)

"

Relevant Logs/Tracebacks

PS C:\Users\matteucc\Desktop\Playground\LlamaIndex> & C:/Users/matteucc/AppData/Local/anaconda3/envs/llamaindex/python.exe c:/Users/matteucc/Desktop/Playground/LlamaIndex/starter.py
DEBUG:llama_index.core.readers.file.base:> [SimpleDirectoryReader] Total files added: 1
DEBUG:fsspec.local:open file: C:/Users/matteucc/Desktop/Playground/LlamaIndex/data/paul_graham_essay.txt
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: What I Worked On

February 2021

Before college...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: All that seemed left for philosophy were edge c...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: Its brokenness did, as so often happens, genera...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: If he even knew about the strange classes I was...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: The students and faculty in the painting depart...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: I wanted to go back to RISD, but I was now brok...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: But alas it was more like the Accademia than no...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: After I moved to New York I became her de facto...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: Now we felt like we were really onto something....
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: In its time, the editor was one of the best gen...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: A company with just a handful of employees woul...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: I stuck it out for a few more months, then in d...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: But about halfway through the summer I realized...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: One of the most conspicuous patterns I've notic...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: Horrified at the prospect of having my inbox fl...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: We'd use the building I owned in Cambridge as o...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: It was originally meant to be a news aggregator...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: It had already eaten Arc, and was in the proces...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: Then in March 2015 I started working on Lisp ag...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: I remember taking the boys to the coast on a su...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: But when the software is an online store builde...
DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: [17] Another problem with HN was a bizarre edge...
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:11434
DEBUG:urllib3.connectionpool:http://localhost:11434 "POST /api/embeddings HTTP/1.1" 200 None
[... the two lines above repeat once per chunk embedded ...]
DEBUG:llama_index.core.indices.utils:> Top 2 nodes:
> [Node 396b67ba-f079-4d38-8b1a-6b71f4d57b90] [Similarity score: 0.269713] Now we felt like we were really onto something. I had visions of a whole new generation of softwa...
> [Node 7df44d30-6d9b-4d53-a070-b9acc411adcd] [Similarity score: 0.26883] It was originally meant to be a news aggregator for startup founders and was called Startup News,...
DEBUG:httpx:load_ssl_context verify=True cert=None trust_env=True http2=False
DEBUG:httpx:load_verify_locations cafile='C:\\Users\\matteucc\\AppData\\Local\\anaconda3\\envs\\llamaindex\\Library\\ssl\\cacert.pem'
DEBUG:httpcore.connection:connect_tcp.started host='localhost' port=11434 local_address=None timeout=360.0 socket_options=None
DEBUG:httpcore.connection:connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x0000021E27884680>
DEBUG:httpcore.http11:send_request_headers.started request=<Request [b'POST']>
DEBUG:httpcore.http11:send_request_headers.complete
DEBUG:httpcore.http11:send_request_body.started request=<Request [b'POST']>
DEBUG:httpcore.http11:send_request_body.complete
DEBUG:httpcore.http11:receive_response_headers.started request=<Request [b'POST']>
DEBUG:httpcore.http11:receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Content-Type', b'application/json; charset=utf-8'), (b'Date', b'Thu, 16 May 2024 20:17:14 GMT'), (b'Content-Length', b'986')])
INFO:httpx:HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 200 OK"
DEBUG:httpcore.http11:receive_response_body.started request=<Request [b'POST']>
DEBUG:httpcore.http11:receive_response_body.complete
DEBUG:httpcore.http11:response_closed.started
DEBUG:httpcore.http11:response_closed.complete
DEBUG:httpcore.connection:close.started
DEBUG:httpcore.connection:close.complete
Based on the context provided in the essay, the author did not directly mention what they did growing up. However, we can infer some information about their background from the text.

The author mentions that they studied art, which suggests that they may have had an interest in the arts from a young age. Additionally, the author's reference to Interleaf and their experience working as a consultant for the company implies that they may have had some exposure to programming or computer science during their formative years.

However, without any direct statements from the author about their childhood experiences or interests, we cannot provide a definitive answer to the query.
@gmatteuc gmatteuc added bug Something isn't working triage Issue needs to be triaged/prioritized labels May 16, 2024

dosubot bot commented May 16, 2024

The issue you're encountering with irrelevant sections of text being retrieved and responded to in the LlamaIndex library could be due to a few key factors related to the configuration of the CitationQueryEngine. Here are some targeted adjustments you can make to improve the relevance of the responses:

  1. Adjust Citation Chunk Size and Overlap: Fine-tune the citation_chunk_size and citation_chunk_overlap parameters. These control the granularity of the text chunks, affecting which sections are considered when answering queries (a chunking sketch follows this list).

  2. Experiment with Text Splitting Methods: The default SentenceSplitter might not always yield the most relevant text chunks. Trying different text splitting strategies or adjusting the existing parameters could lead to better outcomes.

  3. Review Node Postprocessors: Ensure that the node_postprocessors are correctly filtering and refining the nodes. This step is crucial for excluding irrelevant information from the response synthesis process.

  4. Optimize Metadata Mode: The metadata_mode parameter affects how context is provided to the LLM. Adjusting this could help in providing the right amount of context for more accurate responses.

  5. Configure Response Synthesizer Appropriately: The logic and configuration of the response_synthesizer play a significant role in generating the final response. Ensuring it's well-aligned with the type of queries can enhance the relevance of the answers.

By reviewing and potentially adjusting these aspects of the CitationQueryEngine, you should be able to improve the relevance of the responses generated by the LlamaIndex library for your queries.
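As a concrete starting point for points 1 and 2 (note that the starter code uses the default query engine rather than CitationQueryEngine, so the global splitter settings are the relevant knob there), here is a minimal sketch with illustrative, not tuned, values:

from llama_index.core import Settings
from llama_index.core.node_parser import SentenceSplitter

# Smaller, overlapping chunks make each embedding more topically focused;
# chunk_size=512 and chunk_overlap=64 are illustrative starting points.
Settings.text_splitter = SentenceSplitter(chunk_size=512, chunk_overlap=64)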


@logan-markewich
Collaborator

logan-markewich commented May 16, 2024

@gmatteuc It works fine for me; I ran this a few times and the results were consistent.

Code:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama

documents = SimpleDirectoryReader("data").load_data()

# nomic embedding model
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

# ollama
Settings.llm = Ollama(model="llama3", request_timeout=360.0)

index = VectorStoreIndex.from_documents(
    documents,
)

query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
print(response.source_nodes[0].score)
print(response.source_nodes[1].score)
print(response.source_nodes[0].text[:100])
print(response.source_nodes[1].text[:100])

Output:

(llama-index-py3.11) python ./starter.py
Growing up, the author worked on writing short stories and programming outside of school. He wrote awful short stories with hardly any plot, just characters with strong feelings. In 9th grade, he tried to write programs on an IBM 1401 using early Fortran, but couldn't figure out what to do with it due to limited input options. Later, with the arrival of microcomputers, everything changed, and he started programming seriously, writing simple games, a program to predict rocket flight heights, and a word processor that his father used.
0.3468624729835544
0.26796958338700294
In the art world, money and coolness are tightly coupled. Anything expensive comes to be seen as coo
What I Worked On

February 2021

Before college the two main things I worked on, outside of school, 

Maybe make sure you have the latest version of Ollama's server installed, and the latest version of the models pulled?

ollama pull llama3
ollama pull nomic-embed-text
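If that doesn't help, you can also sanity-check the embedding model by hitting Ollama's embeddings endpoint directly (the same POST /api/embeddings call visible in your debug log); a quick sketch, assuming the default localhost server:

import requests

# Query Ollama's embeddings endpoint directly to confirm
# nomic-embed-text returns a sane vector.
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "What did the author do growing up?"},
)
embedding = resp.json()["embedding"]
print(len(embedding), embedding[:5])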

@gmatteuc
Author

Yes, I pulled llama3 and nomic-embed-text in Ollama (which I just reinstalled) as described.
Running your code snippet, I get this:

c:/Users/matteucc/Desktop/Playground/LlamaIndex/data/newtest.py
Based on the provided context information, there is no mention of the author's childhood or upbringing. The text only discusses the author's experiences and decisions in his adult life, specifically related to his career and entrepreneurial ventures. Therefore, it is not possible to answer this query based on the given context.
0.26971341318867764
0.268829948067337
Now we felt like we were really onto something. I had visions of a whole new generation of software
It was originally meant to be a news aggregator for startup founders and was called Startup News, bu

@logan-markewich
Collaborator

That's pretty weird. I see you are on Windows; I wonder if it's an Ollama + Windows thing.

@gmatteuc
Author

Weird indeed. I will retry on another computer and/or using gpt-3.5-turbo to see if I get the same issue...

@gmatteuc
Author

@logan-markewich Hi! Small update on the issue. I tried with gpt-3.5-turbo and it works. I also tried LangChain + Ollama and got the same issue. I think the problem is not with the language model itself but with the similarity search in the vector store: in both cases the retrieved documents don't always fit the query, and not all parts of the documents seem to be accessible for retrieval... Indeed, with LangChain, if I swap the embedding generator from the Ollama one to the OpenAI one while keeping Ollama as the LLM, the problem is fully solved (I haven't tried the same in LlamaIndex yet).
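For anyone wanting to try the same swap in LlamaIndex, it would look roughly like this (untested here; assumes OPENAI_API_KEY is set in the environment and llama-index-embeddings-openai is installed):

from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.ollama import Ollama

# Swap only the embedding model to OpenAI; answer synthesis stays local.
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
Settings.llm = Ollama(model="llama3", request_timeout=360.0)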
