-
Notifications
You must be signed in to change notification settings - Fork 549
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
http: do not retry connection errors #18412
base: dev
Are you sure you want to change the base?
http: do not retry connection errors #18412
Conversation
I spotted that in some rpfixture tests the cluster metadata uploader is running with an incorrect port[^1]. Beside logging lots of connection refused errors, I found it to adding a non-negligible overhead to some variations of the test. The overhead is result of relatively tight loop in the `client::get_connected` method which I'm trying to remove in this commit. I don't believe this is necessary. Returning the error to the caller is much better. The callers usually have better retry mechanisms in place with backoff and can log much more useful messages since they have access to more context about the operation. [^1]: We need to investigate separately why it doesn't get the updated configuration when we configure cloud storage (you can try this test for example redpanda-data#13275) or if it needs to run as well.
new failures in https://buildkite.com/redpanda/redpanda/builds/48975#018f68d6-a47c-4938-bc9d-dd0bfeb2e728:
new failures in https://buildkite.com/redpanda/redpanda/builds/48975#018f68d6-a47e-4e6a-b920-32df6730dae7:
new failures in https://buildkite.com/redpanda/redpanda/builds/48975#018f68d6-a483-4f85-9068-9801b0c288c6:
new failures in https://buildkite.com/redpanda/redpanda/builds/48975#018f68d6-a481-4bd0-b5ee-d6abf440a3af:
new failures in https://buildkite.com/redpanda/redpanda/builds/48975#018f68df-c1ab-437d-8f4f-b1eaafda6af1:
new failures in https://buildkite.com/redpanda/redpanda/builds/48975#018f68df-c1ae-4bf1-8107-77c9356e41d5:
new failures in https://buildkite.com/redpanda/redpanda/builds/48975#018f68df-c1b1-4f73-a1c1-6b80279eb782:
|
Blocked on #18422 |
current = ss::lowres_clock::now(); | ||
// Any TLS error have to be propagated because it's not | ||
// transient. It won't help to try once again. | ||
if (_as != nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
misses the deadline .... if above deadline we should not try, no?
Let's see what CI says.
I spotted that in some rpfixture tests the cluster metadata uploader is running with an incorrect port1. Beside logging lots of connection refused errors, I found it to adding a non-negligible overhead to some variations of the test. The overhead is result of relatively tight loop in the
client::get_connected
method which I'm trying to remove in this commit.I don't believe this is necessary. Returning the error to the caller is much better. The callers usually have better retry mechanisms in place with backoff and can log much more useful messages since they have access to more context about the operation.
Backports Required
Release Notes
Footnotes
We need to investigate separately why it doesn't get the updated configuration when we configure cloud storage (you can try this test for example https://github.com/redpanda-data/redpanda/issues/13275) or if it needs to run as well. ↩