Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backoff: reset on successful event poll #564

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

chrisirhc
Copy link
Contributor

@chrisirhc chrisirhc commented Mar 13, 2024

Fixes: #554

@chrisbobbe
Copy link
Collaborator

I've just edited the issue description to reference the issue it looks like this is about. Please correct it if it's wrong, but if it's right, please add a "Fixes" line to the relevant commit message too, once it's ready to be a not-draft. 🙂

@chrisirhc
Copy link
Contributor Author

Ah yes, I created this but forgot about it.
I saw a different way of approaching this by reseting the reference to backoffMachine as null instead of maintaining this reset state over at:

BackoffMachine? backoffMachine;
while (true) {
try {
return await registerQueue(connection);
} catch (e) {
assert(debugLog('Error fetching initial snapshot: $e\n'
'Backing off, then will retry…'));
// TODO tell user if initial-fetch errors persist, or look non-transient
await (backoffMachine ??= BackoffMachine()).wait();
assert(debugLog('… Backoff wait complete, retrying initial fetch.'));
}
}

I could've gone down that approach as well, to keep the codebase consistent. Let me know if there's an opinion on this.

@chrisirhc chrisirhc marked this pull request as ready for review March 27, 2024 01:56
Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @chrisirhc for this PR! I'm back from vacation this week (and the rush of GSoC applicants is over), and I'm starting to catch up now.

One substantive comment below.

Also a commit-message nit: the summary line after the colon should be in sentence case. So e.g. "store: Reset backoff on success", rather than "store: reset …".

Comment on lines +63 to +64
final durationReset1 = await measureWait(backoffMachine.wait());
check(durationReset1).isLessThan(duration2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test will flake — it'll fail about 25% of the time. For example:

$ time flutter test test/api/backoff_test.dart 
00:01 +1 -1: BackoffMachine.reset [E]                                            
  Expected: a Duration that:
    is less than <0:00:00.023591>
  Actual: <0:00:00.049613>
  Which: is not less than <0:00:00.023591>
  package:checks/src/checks.dart 85:9             check.<fn>
  package:checks/src/checks.dart 708:12           _TestContext.expect
  package:checks/src/extensions/core.dart 167:13  ComparableChecks.isLessThan
  test/api/backoff_test.dart 64:27                main.<fn>

The trouble is that the behavior of wait is randomized, and the range of possible durations for the second wait (or even the Nth wait for large N) overlaps with the range for the first wait.

That randomization is why the existing test for that method, above, runs 100 trials and then makes a fairly loose check on the results; the parameters are chosen, using some math that's in the comments, to ensure the flake rate is below one in a billion.

(I picked that threshold so that even in a large codebase, even with lots of randomized test cases using that threshold, the total flake rate due to randomization would be negligible compared to other sources of flakiness like races, other bugs, and infra issues.)

We totally could write a working test for this reset method by using the same strategy. But maybe the complexity of that is a good reason to skip the reset method and go with the other approach you mentioned at #564 (comment) , of just resetting a BackoffMachine? local to null.

@gnprice gnprice added the completion candidate PR with reviews that may unblock merging label May 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
completion candidate PR with reviews that may unblock merging
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reset backoff interval after successful connection
3 participants