HTTP Error after mapping over some rows #8

Open
rawuph opened this issue Sep 16, 2023 · 5 comments

@rawuph

rawuph commented Sep 16, 2023

I get a reproducible error: in a fresh R session, something like the following works as long as there are sufficiently few rows (<5).

run_chat <- function(call_string) {
  result <- call_string %c% "message"
  return(result)
}

results <- map(test_df$call_string, run_chat)

For larger datasets, e.g. n = 10, I get something like this, although the index at which it occurs varies:

HTTP error : Bad Request
Error in map():
ℹ In index: 10.
Caused by error in messages():
! Invalid parameter value. Expecting list or chatlog object.

Backtrace:

  1. ├─purrr::map(test_production$call_string, run_chat)
  2. │ └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
  3. │ ├─purrr:::with_indexed_errors(...)
  4. │ │ └─base::withCallingHandlers(...)
  5. │ ├─purrr:::call_with_cleanup(...)
  6. │ └─global .f(.x[[i]], ...)
  7. │ └─call_string %c% "message"
  8. │ └─TheOpenAIR::chat(message = message, output = output)
  9. │ └─resp %>% messages() %>% add_to_chatlog(chatlog_id)
  10. ├─TheOpenAIR::add_to_chatlog(., chatlog_id)
  11. │ └─base::is.data.frame(msgs)
  12. └─TheOpenAIR::messages(.)
  13. └─base::stop("Invalid parameter value. Expecting list or chatlog object.")

After this error occurred once, simply using the chat() function throws the same error, which is resolved when I restart the session.

@JLDC
Collaborator

JLDC commented Sep 18, 2023

Hi @rawuph,

My first suspicion is that this is linked to too many API calls in a short time. Could you please provide us with an MRE? Perhaps a dummy data frame where the problem occurs. For instance, using the following

test_df <- data.frame(call_string=paste("Say", 1:20))

I am not able to reproduce the issue you mentioned.

Thanks.

@rawuph
Author

rawuph commented Sep 18, 2023

Thanks for getting back!

Indeed, this works for me as well.

My df contains a much more complex prompt that requires a fair bit of reasoning; perhaps it is related to that.

Here is an (unrelated) MRE that throws an error for me at index 15 on the first run and at index 12 on the second try.

run_chat <- function(call_string) {
  result <- call_string %c% "message"
  return(result)
}

test_df <- data.frame(call_string=paste("Calculate the time it takes for an airplane travelling at 0.244 times the speed of light to reach the sun, if the sun was", 
1:20, 
"times the actual distance from earth"))

mre <- map(test_df$call_string, run_chat)

@umatter
Owner

umatter commented Sep 18, 2023

@rawuph : thanks for the MRE. I could reproduce the problem and then briefly tested it by re-writing map as a for-loop:

> testlist <- list()
> length(testlist) <- nrow(test_df)
> for (i in 1:nrow(test_df)){
+   testlist[[i]] <- run_chat(test_df[i, "call_string"])
+   
+ }
HTTP error : Bad Request
Error in messages(.) : 
  Invalid parameter value. Expecting list or chatlog object.
> testlist[[1]]
[1] "To calculate the time it takes for an airplane traveling at 0.244 times the speed of light to reach the sun, we need to know the actual distance from Earth to the sun. The average distance from Earth to the sun is approximately 93 million miles or about 150 million kilometers.\n\nIf we assume that the distance from Earth to the sun is 1 times the actual distance (so, 93 million miles or 150 million kilometers), and the airplane is traveling at 0.244 times the speed of light, we can calculate the time it takes using the formula:\n\nTime = Distance / Speed\n\nGiven:\nDistance = 93 million miles or 150 million kilometers\nSpeed = 0.244 times the speed of light\n\nUsing miles as the unit:\n\nTime = 93 million miles / (0.244 * speed of light)\n\nThe speed of light is approximately 670,616,629 miles per hour.\n\nTime = 93,000,000 miles / (0.244 * 670,616,629 miles/hour)\n     ≈ 190.2 hours or approximately 7.93 days.\n\nUsing kilometers as the unit:\n\nTime = 150 million kilometers / (0.244 * speed of light)\n\nThe speed of light is approximately 1,080,000,000 kilometers/hour.\n\nTime = 150,000,000 kilometers / (0.244 * 1,080,000,000 kilometers/hour)\n     ≈ 54.7 hours or approximately 2.28 days.\n\nPlease note that these calculations are based on assumptions and estimations, and the actual time it would take for an airplane to reach the sun, even at a fraction of the speed of light, is not feasible with current technology."
> testlist[[2]]
[1] "To calculate the time it takes for an airplane traveling at 0.244 times the speed of light to reach the sun, if the sun was 2 times the actual distance from Earth, we need to know the actual distance from Earth to the sun.\n\nThe average distance from Earth to the sun is approximately 93 million miles or about 150 million kilometers. If the sun was 2 times this distance, it would be:\n\nDistance = 2 * 93 million miles = 186 million miles\n\nTo calculate the time it takes, we can use the formula:\n\nTime = Distance / Speed\n\nGiven:\nDistance = 186 million miles\nSpeed = 0.244 times the speed of light\n\nThe speed of light is approximately 670,616,629 miles per hour.\n\nTime = 186,000,000 miles / (0.244 * 670,616,629 miles/hour)\n     ≈ 456.8 hours or approximately 19.03 days.\n\nPlease note that these calculations are based on assumptions and estimations, and the actual time it would take for an airplane to reach the sun, even at a fraction of the speed of light, is not feasible with current technology."
> testlist[[3]]

Basically, this confirms @JLDC's guess that the OpenAI API issues a RateLimitError. It works for a while and then fails.
The way %c% is implemented is not optimal for such batch processing, I should note. Like chat(), it automatically provides memory of the conversation. The functionality is similar to ChatGPT, but with the disadvantage of extending the context window and thus reaching the rate limit in such cases (as the no. of tokens increases with the no. of iterations).
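
To make the growth concrete, here is a rough back-of-the-envelope sketch. The figure of 300 tokens per exchange is a made-up assumption, not a measurement from this MRE, and the model ignores details such as system messages; it only illustrates why request size climbs with the iteration count when memory is on.

```r
# Hypothetical number: assume each prompt plus its reply is about 300 tokens.
tokens_per_exchange <- 300
n_calls <- 20

# With memory on, call i carries the i - 1 earlier exchanges plus the new one.
with_memory <- tokens_per_exchange * seq_len(n_calls)

# With the chatlog cleared after every call, each request stays the same size.
without_memory <- rep(tokens_per_exchange, n_calls)

sum(with_memory)     # total tokens over all calls when memory is kept
sum(without_memory)  # total tokens over all calls when the chatlog is cleared
```

Under these assumptions the total grows quadratically with the number of calls when memory is kept, and only linearly when the chatlog is cleared each time.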

@JLDC : we should add an option to switch memory off; I'll add a ticket to the backlog. We also need better error handling for such cases.
@rawuph : it should work if you add a clear_chatlog() in the iteration (or in the function)

testlist <- vector("list", nrow(test_df))
for (i in seq_len(nrow(test_df))) {
  testlist[[i]] <- run_chat(test_df[i, "call_string"])
  clear_chatlog()  # drop the stored conversation so the context does not grow
}

@rawuph
Author

rawuph commented Sep 18, 2023

@umatter thank you for looking into it and providing the workaround.

It appears I did not notice that it keeps the conversation's context when you call chat() multiple times, although reading through the documentation again, it is indeed mentioned.

I am essentially trying to find the easiest solution to run the same single-shot prompt, consisting of a fixed part A and a variable part B, on ~2000 rows. Do you expect this would work nicely with clearing the chatlog, or should I look for other ways, e.g. openai::create_chat_completion directly?

@umatter
Owner

umatter commented Sep 18, 2023

@rawuph : my guess is if your use case is close to your example, the simple trick with clearing the chatlog should do. the lower-level functions like create_chat_completion offer more flexibility, but I don't see that you would actually need it in such a use case. As a side-remark: given the current (rather absent) API-error-handling of OpenAIR, I would in any case recommend using a loop or apply-type of approach to iterate through the rows such that intermediate results are at least temporarily stored in the global env (or maybe even better, written to files on disk).
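
A minimal sketch of that side-remark, for illustration only: loop over the rows, catch errors per row, and write the partial results to disk after every step. The names `run_batch`, `ask_fn`, and `checkpoint_file` are hypothetical and not part of OpenAIR; `ask_fn` stands in for whatever sends a single prompt, for example the `run_chat()` helper from this thread followed by `clear_chatlog()`. The usage part uses a dummy function rather than a real API call.

```r
# Sketch: iterate row by row, keep partial results, and checkpoint them to disk.
# `ask_fn` is a hypothetical stand-in for a function that sends one prompt and
# returns the reply as a character string.
run_batch <- function(prompts, ask_fn, checkpoint_file) {
  results <- vector("list", length(prompts))
  for (i in seq_along(prompts)) {
    results[i] <- list(tryCatch(
      ask_fn(prompts[[i]]),
      error = function(e) {
        message("Row ", i, " failed: ", conditionMessage(e))
        NA_character_  # placeholder so the failed row can be retried later
      }
    ))
    saveRDS(results, checkpoint_file)  # overwrite the checkpoint after every row
  }
  results
}

# Usage with a dummy function that fails on one input instead of a real API call.
fake_ask <- function(prompt) {
  if (prompt == "Say 3") stop("Bad Request")
  toupper(prompt)
}
checkpoint <- tempfile(fileext = ".rds")
out <- run_batch(paste("Say", 1:5), fake_ask, checkpoint)
```

With this layout a failure at one row no longer aborts the whole run, and `readRDS(checkpoint)` recovers whatever was finished if the session dies partway through. Rewriting the file on every row is simple but gets slower for very long runs; checkpointing every k rows would be an easy variation.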
