
When the text gets longer, the embedding API does not seem to work. #19

Open
hippoley opened this issue Mar 20, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@hippoley

No description provided.

@parthsarthi03
Owner

Are you using the default OpenAI embeddings? Is there an error message or is the code stalling?

@parthsarthi03 parthsarthi03 added the bug Something isn't working label Mar 21, 2024
@hippoley
Author

I have noticed that you have already added retry decorators, but the 429 response is still being triggered. Only some open-source embedding methods seem to work.

# tenacity's decorator is lowercase `retry`; `@Retry` would raise a NameError
@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
def create_embedding(self, text):
    text = text.replace("\n", " ")
    return (
        self.client.embeddings.create(input=[text], model=self.model)
        .data[0]
        .embedding
    )

If I shorten the text, the embedding API works. However, once the text gets longer, the following 429 errors show up:

2024-03-15 07:46:21,911 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,912 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,912 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,913 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,916 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,916 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,917 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,918 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,922 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,925 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,926 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,928 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,930 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,933 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,934 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
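A possible client-side mitigation (a sketch, not part of raptor): throttle the embedding calls so they are spaced out instead of bursting, since exponential-backoff retries alone keep re-hitting the limit when many threads fire at once. A minimal token-bucket limiter, with an injectable clock so it can be tested without real waiting:

```python
import threading
import time


class RateLimiter:
    """Token bucket: allows `rate` calls per second, with bursts up to `burst`."""

    def __init__(self, rate, burst=1, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate)
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.sleep = sleep
        self.last = clock()
        self.lock = threading.Lock()

    def acquire(self):
        """Block until a token is available, then consume it."""
        with self.lock:
            now = self.clock()
            # Refill tokens for the time elapsed, capped at the burst capacity.
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens < 1:
                # Wait just long enough for one token to accrue, then consume it.
                self.sleep((1 - self.tokens) / self.rate)
                self.last = self.clock()
                self.tokens = 0.0
            else:
                self.tokens -= 1.0
```

Calling `limiter.acquire()` at the top of `create_embedding` (with `rate` set below your account's requests-per-second quota) should keep bursts from triggering 429s in the first place; the exact quota value is account-specific.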

@parthsarthi03
Owner

It seems like you're running into rate limiting on OpenAI's API. You could request a rate limit increase from OpenAI if they allow it.

Also, we currently use multithreading to build the leaf nodes. You can switch it off, which will be slower but should help avoid hitting the rate limits.

To make this change, update the following line in raptor/RetrievalAugmentation.py

From:

self.tree = self.tree_builder.build_from_text(text=docs)

To:

self.tree = self.tree_builder.build_from_text(text=docs, use_multithreading=False)
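A complementary mitigation (a sketch, not raptor code): the OpenAI embeddings endpoint accepts a list of inputs per request, so sending texts in batches cuts the request count dramatically and makes 429s far less likely. The helper below takes a caller-supplied `embed_batch` function (the wrapper shown in its docstring is hypothetical):

```python
def batched(items, batch_size):
    """Yield successive fixed-size batches from a list."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


def embed_all(texts, embed_batch, batch_size=64):
    """Embed many texts using one request per `batch_size` inputs.

    `embed_batch` takes a list of strings and returns one embedding vector
    per input, e.g. (hypothetical OpenAI wrapper):

        def embed_batch(batch):
            resp = client.embeddings.create(input=batch, model=model)
            return [d.embedding for d in resp.data]
    """
    vectors = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors
```

With 1,000 leaf chunks and `batch_size=64`, this issues 16 requests instead of 1,000, which usually stays well under per-minute request quotas.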

@hippoley
Author


Thanks! Following your instructions, I changed the line.
It still works for short contexts, but with longer ones timeouts still occur.

@cuichenxu

cuichenxu commented Apr 9, 2024

Strange! It works fine when I run the demo the first time, but when I rerun it an error occurs; the text in the demo seems too long for raptor.
I can run it successfully after cutting out part of the content.

EmbedModel, QAModel and SummModel are all custom.

Following this, the problem was solved.

Update:
When multithreading is switched off, building a tree from a story extracted from the NarrativeQA dataset takes too long...
Any ideas on how to fix this? @parthsarthi03
