
When the text gets longer, the embedding API does not seem to work. #19

Open
hippoley opened this issue Mar 20, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@hippoley

No description provided.

@parthsarthi03
Owner

Are you using the default OpenAI embeddings? Is there an error message or is the code stalling?

@parthsarthi03 parthsarthi03 added the bug Something isn't working label Mar 21, 2024
@hippoley
Author

I have noticed that you have already added retry decorators, but the 429 response is still being triggered. Only some open-source embedding methods seem to work.

# tenacity's decorator is lowercase `retry`; `@Retry` would raise a NameError
@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
def create_embedding(self, text):
    text = text.replace("\n", " ")
    return (
        self.client.embeddings.create(input=[text], model=self.model)
        .data[0]
        .embedding
    )

If I shorten the text, the embedding API works. However, once the text gets longer, the following 429 errors show up:

2024-03-15 07:46:21,911 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,912 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,912 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,913 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,916 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,916 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,917 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,918 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,922 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,925 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,926 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,928 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,930 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,933 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,934 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
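A possible client-side mitigation (a sketch, not part of raptor): throttle the embedding calls so they are spaced out instead of bursting, since exponential-backoff retries alone keep re-hitting the limit when many threads fire at once. A minimal token-bucket limiter, with an injectable clock so it can be tested without real waiting:

```python
import threading
import time


class RateLimiter:
    """Token bucket: allows `rate` calls per second, with bursts up to `burst`."""

    def __init__(self, rate, burst=1, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate)
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.sleep = sleep
        self.last = clock()
        self.lock = threading.Lock()

    def acquire(self):
        """Block until a token is available, then consume it."""
        with self.lock:
            now = self.clock()
            # Refill tokens for the time elapsed, capped at the burst capacity.
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens < 1:
                # Wait just long enough for one token to accrue, then consume it.
                self.sleep((1 - self.tokens) / self.rate)
                self.last = self.clock()
                self.tokens = 0.0
            else:
                self.tokens -= 1.0
```

Calling `limiter.acquire()` at the top of `create_embedding` (with `rate` set below your account's requests-per-second quota) should keep bursts from triggering 429s in the first place; the exact quota value is account-specific.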

@parthsarthi03
Owner

It seems like you're running into rate limiting on OpenAI's API. You could request a rate limit increase from OpenAI if they allow it.

Also, we currently use multithreading to build the leaf nodes. You can switch it off, which will be slower but should help avoid hitting the rate limits.

To make this change, update the following line in raptor/RetrievalAugmentation.py

From:

self.tree = self.tree_builder.build_from_text(text=docs)

To:

self.tree = self.tree_builder.build_from_text(text=docs, use_multithreading=False)
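A complementary mitigation (a sketch, not raptor code): the OpenAI embeddings endpoint accepts a list of inputs per request, so sending texts in batches cuts the request count dramatically and makes 429s far less likely. The helper below takes a caller-supplied `embed_batch` function (the wrapper shown in its docstring is hypothetical):

```python
def batched(items, batch_size):
    """Yield successive fixed-size batches from a list."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


def embed_all(texts, embed_batch, batch_size=64):
    """Embed many texts using one request per `batch_size` inputs.

    `embed_batch` takes a list of strings and returns one embedding vector
    per input, e.g. (hypothetical OpenAI wrapper):

        def embed_batch(batch):
            resp = client.embeddings.create(input=batch, model=model)
            return [d.embedding for d in resp.data]
    """
    vectors = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors
```

With 1,000 leaf chunks and `batch_size=64`, this issues 16 requests instead of 1,000, which usually stays well under per-minute request quotas.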

@hippoley
Author


Thanks! Following your instructions, I changed the line.
It still works for short contexts, but with longer ones timeouts still occur.

@cuichenxu

cuichenxu commented Apr 9, 2024

Strange! It works fine when I run the demo the first time, but when I rerun it an error occurs; the text in the demo seems too long for raptor.
I can run it successfully after cutting out part of the content.

EmbedModel, QAModel and SummModel are all custom.

Following this, the problem was solved.

Update:
When multithreading is switched off, building a tree from a story extracted from the NarrativeQA dataset takes too long...
Any ideas on how to fix this? @parthsarthi03
