
feat(HTTP): Retry failed HTTP requests with exponential backoff #1098

Draft · wants to merge 1 commit into main
Conversation

mariocynicys

This PR implements exponential backoff retries for HTTP file requests.

I am not very familiar with the code base or how Packager works, so I could use some help clarifying the following:

  • For PUT or POST requests, the upload_cache_ would be drained, right? Does that mean we have to keep a copy of it?
  • Are there any other cases in which we should retry, besides client timeouts and these specific response codes?

Tests yet to be added.
Closes #1095

long res_code = 0;
curl_easy_getinfo(curl_.get(), CURLINFO_RESPONSE_CODE, &res_code);
if (res_code == 408 || res_code == 429 || res_code >= 500) {
// FIXME: Should we clear download_cache_ and upload_cache_ here??
Member

Yes, you'll need to ensure that the body of the loop has no side effects. I haven't analyzed what the caches do here, but it's worth looking into more closely.

You may want to write test cases that verify the correct behavior for POST/PUT with data. In the cmake branch, you'll find HTTP file tests are all enabled and working. (In main, they are all disabled.) You should be able to verify retry behavior with httpbin, and see that the final failed retry still has a response that reflects the data from the POST/PUT.

I recommend you try the cmake branch in a separate clone, without using gclient. We have yet to update the docs for that, but this is what you'll want to try:

# Clone
git clone ...
# Use cmake branch
git checkout -b cmake upstream/cmake
# Get submodules
git submodule init && git submodule update
# Gen build files
cmake -S . -B build/ -DCMAKE_BUILD_TYPE=Debug
# Build
cmake --build build/ --config Debug --parallel
# Test
(cd build && ctest -C Debug -V)

If you have time, you could make a PR against the cmake branch, which is where most development is happening right now as we rewrite the build system and dependencies.

@joeyparrish
Member

Please merge with main, which has changed a lot in the recent cmake port. Thanks!

@cosmin
Collaborator

cosmin commented Feb 21, 2024

This has been rebased to main and the tests do pass, but from some digging into the implications of the download and upload caches, I'm not sure this can be done safely. For example, consider using this to upload packaging output to some HTTP server.

As soon as curl calls CurlReadCallback, the underlying circular buffer notifies potential writers that were blocked on writing to it that there is room to write. Those writers may start writing right away, overwriting the data in the cache. Meanwhile, if the request fails and needs to be retried, the underlying circular buffer has already been altered, so the request cannot be retried.

I don't think this can be done safely with the current IoCache interface as the backing buffer. I believe the data from the upload_cache_ would have to be written to some other intermediary buffer (without triggering read events) so that curl can retry safely; only upon success should the IoCache circular buffer signal the read event and advance its pointers.

Likewise for the download cache: only upon a successful download should the data be written to download_cache_; otherwise a consumer may have already consumed the bad response body by the time we retry.

Alternatively, to avoid the memory copies, I think we'd have to take a lock on IoCache around the curl_easy_perform loop, use some lower-level interfaces than Read() and Write() (perhaps something like MaybeRead() and MaybeWrite()), and then explicitly release the lock and trigger the read event once the request has succeeded.

@cosmin cosmin marked this pull request as draft February 23, 2024 02:19
Successfully merging this pull request may close these issues.

Implement exponential backoff for HTTP PUT
3 participants