Reduce frequency of polling unknown validators to avoid overwhelming the Beacon Node #5628

Merged
merged 4 commits into from
May 22, 2024

Conversation

jimmygchen
Member

Issue Addressed

Addresses #4388.

A large number of unknown validators on a VC is known to overwhelm the beacon node because the VC triggers a retrieval for each validator per slot. This PR reduces the frequency to one query per unknown validator each epoch.

Proposed Changes

  1. Poll for all validator indices on startup (unchanged)
  2. If any validator is unknown, register to poll again in 32 slots (1 epoch) instead of in the next slot.
  3. Avoid polling in the first slot of an epoch.

For more details and rationale, see comment here.
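
The three steps above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `UnknownValidatorPoller`, its methods, and the use of `String` pubkeys are hypothetical names chosen for the example, and `SLOTS_PER_EPOCH` is hard-coded to the mainnet value of 32.

```rust
use std::collections::HashMap;

/// Mainnet value, hard-coded for the sketch (assumption).
const SLOTS_PER_EPOCH: u64 = 32;

fn is_first_slot_of_epoch(slot: u64) -> bool {
    slot % SLOTS_PER_EPOCH == 0
}

/// Hypothetical tracker: maps an unknown validator's pubkey to the
/// earliest slot at which it may be polled again.
#[derive(Default)]
struct UnknownValidatorPoller {
    next_poll_slot: HashMap<String, u64>,
}

impl UnknownValidatorPoller {
    fn new() -> Self {
        Self::default()
    }

    /// Step 1: a validator that has never been polled (e.g. on startup) is
    /// always polled. Steps 2 and 3: otherwise wait until the registered
    /// slot, and never poll in the first slot of an epoch.
    fn should_poll(&self, pubkey: &str, current_slot: u64) -> bool {
        match self.next_poll_slot.get(pubkey) {
            None => true,
            Some(&next) => current_slot >= next && !is_first_slot_of_epoch(current_slot),
        }
    }

    /// The BN did not know the validator: poll again one epoch later.
    fn record_unknown(&mut self, pubkey: &str, current_slot: u64) {
        self.next_poll_slot
            .insert(pubkey.to_string(), current_slot.saturating_add(SLOTS_PER_EPOCH));
    }

    /// The BN returned an index: stop tracking the validator.
    fn record_known(&mut self, pubkey: &str) {
        self.next_poll_slot.remove(pubkey);
    }
}
```

In this sketch, a validator first polled at slot 0 and still unknown is not due at slot 1, is skipped at slot 32 (first slot of the next epoch), and becomes due at slot 33.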

@jimmygchen jimmygchen added val-client Relates to the validator client binary ready-for-review The code is ready for review labels Apr 23, 2024
@jimmygchen jimmygchen force-pushed the reduce-polling-pending-validators branch from 097dafd to ce914d1 Compare April 23, 2024 07:24
Member

@pawanjay176 pawanjay176 left a comment

Nice, LGTM. Just a minor question

validator_client/src/duties_service.rs
@jimmygchen jimmygchen added waiting-on-author The reviewer has suggested changes and awaits their implementation. and removed ready-for-review The code is ready for review labels Apr 26, 2024
@jimmygchen
Member Author

I'll do a bit of manual testing next week before we merge this!

@jimmygchen jimmygchen self-assigned this Apr 26, 2024
@jimmygchen
Member Author

@chong-he found an issue during testing: it looks like the VC re-queries the BN during the first slot of the epoch. I'll look into this.

@jimmygchen
Member Author

Upon investigating the logs, I think it is working as intended, and the poll in the first slot of epoch is skipped. Will wait for CK to confirm.

However, I think this could still happen because of the async nature of this function, in the scenario where the BN returns a response really late. If iterating through all the validators takes close to 12 seconds (one slot), we could end up querying in a new slot, which could be the first slot of a new epoch.

This is because of the calculation outside the for loop:

let current_slot_opt = duties_service.slot_clock.now();
let next_poll_slot_opt = current_slot_opt.map(|slot| slot.saturating_add(E::slots_per_epoch()));
let is_first_slot_of_epoch = if let Some(current_slot) = current_slot_opt {
    let current_epoch_first_slot = current_slot
        .epoch(E::slots_per_epoch())
        .start_slot(E::slots_per_epoch());
    current_slot == current_epoch_first_slot
} else {
    false
};
for pubkey in all_pubkeys {
    // This is on its own line to avoid some weirdness with locks and if statements.
    let is_known = duties_service
        .validator_store
        .initialized_validators()
        .read()
        .get_index(&pubkey)
        .is_some();

For more accuracy, I'll move the calculation into the for loop: we await on each validator query, so a slot computed once before the loop can become stale if we have a large list or a slow BN.
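
To make the staleness problem concrete, here is a small synchronous sketch. It does not use the real `duties_service` or slot clock: `FakeClock` and both `poll_*` functions are hypothetical, and each simulated BN query is assumed to take one full slot, which is the slow-BN worst case described above.

```rust
use std::cell::Cell;

/// Mainnet value, hard-coded for the sketch (assumption).
const SLOTS_PER_EPOCH: u64 = 32;

/// Hypothetical stand-in for the slot clock. Time only moves when the
/// simulated query calls `advance`.
struct FakeClock {
    slot: Cell<u64>,
}

impl FakeClock {
    fn new(slot: u64) -> Self {
        Self { slot: Cell::new(slot) }
    }
    fn now(&self) -> u64 {
        self.slot.get()
    }
    fn advance(&self, slots: u64) {
        self.slot.set(self.slot.get() + slots);
    }
}

fn is_first_slot_of_epoch(slot: u64) -> bool {
    slot % SLOTS_PER_EPOCH == 0
}

/// Reads the slot once, before the loop. Returns the slots in which a
/// query was actually sent.
fn poll_with_stale_slot(clock: &FakeClock, validators: usize) -> Vec<u64> {
    let skip = is_first_slot_of_epoch(clock.now());
    let mut queried_at = Vec::new();
    for _ in 0..validators {
        if skip {
            continue;
        }
        queried_at.push(clock.now());
        clock.advance(1); // slow BN: the response takes a whole slot
    }
    queried_at
}

/// Re-reads the slot for every validator, so the first-slot check stays
/// accurate even when earlier queries were slow.
fn poll_with_fresh_slot(clock: &FakeClock, validators: usize) -> Vec<u64> {
    let mut queried_at = Vec::new();
    for _ in 0..validators {
        let current_slot = clock.now();
        if is_first_slot_of_epoch(current_slot) {
            continue; // left for a later polling pass
        }
        queried_at.push(current_slot);
        clock.advance(1);
    }
    queried_at
}
```

Starting at slot 30 with four validators, the stale variant sends queries in slots 30, 31, 32 and 33, so one lands in the first slot of the next epoch. The fresh variant sends queries in slots 30 and 31 only and leaves the rest for a later pass.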

@chong-he
Member

chong-he commented May 6, 2024

@chong-he found an issue during testing: it looks like the VC re-queries the BN during the first slot of the epoch. I'll look into this.

Apologies! I was looking at the wrong slot time.

The VC does not re-query in the first slot of the next epoch (even though we start the VC in the first slot). It waits until the 12s slot has elapsed and only then re-queries the BN about the status (i.e., it queries in the second slot of an epoch). The VC also queries only once per epoch, which is a great improvement.

All good upon testing, this PR is good to go
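
The observed timeline can be reproduced with a tiny simulation. This is only a model of the behaviour described in this thread, under the assumptions that a validator stays unknown for the whole run, that an epoch is 32 slots, and that the re-poll rule is "one epoch after the last poll, but never in the first slot of an epoch". `simulate_poll_slots` is a hypothetical helper, not part of the VC.

```rust
/// Mainnet value, hard-coded for the sketch (assumption).
const SLOTS_PER_EPOCH: u64 = 32;

/// Returns the slots in `start_slot..=end_slot` in which a validator that
/// stays unknown would be queried under the assumed rules.
fn simulate_poll_slots(start_slot: u64, end_slot: u64) -> Vec<u64> {
    let mut polled = Vec::new();
    // `None` means the validator has not been polled yet (VC startup).
    let mut next_poll_slot: Option<u64> = None;
    for slot in start_slot..=end_slot {
        let due = match next_poll_slot {
            None => true,
            Some(next) => slot >= next && slot % SLOTS_PER_EPOCH != 0,
        };
        if due {
            polled.push(slot);
            // Still unknown: register the next poll one epoch later.
            next_poll_slot = Some(slot.saturating_add(SLOTS_PER_EPOCH));
        }
    }
    polled
}
```

Starting the VC in slot 0 and running to slot 100 gives polls in slots 0, 33, 65 and 97: the startup poll, then one poll per epoch in the second slot of each epoch, matching the test result above.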

@jimmygchen jimmygchen added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits their implementation. labels May 7, 2024
@michaelsproul michaelsproul added the v5.2.0 Q2 2024 label May 21, 2024
Member

@michaelsproul michaelsproul left a comment

Looks good. Nice improvement

I think we should include this in 5.2

validator_client/src/duties_service.rs Outdated
@michaelsproul michaelsproul added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels May 21, 2024
Co-authored-by: Michael Sproul <micsproul@gmail.com>
@michaelsproul
Member

sorry looks like cargo fmt is failing because I messed up the indentation in my suggestion

@jimmygchen
Member Author

My bad, should've checked myself :P thanks!

@jimmygchen jimmygchen added ready-for-review The code is ready for review ready-for-merge This PR is ready to merge. and removed waiting-on-author The reviewer has suggested changes and awaits their implementation. ready-for-review The code is ready for review ready-for-merge This PR is ready to merge. labels May 21, 2024
@michaelsproul
Member

@Mergifyio queue

mergify bot commented May 22, 2024

queue

The pull request has been merged automatically at 52e3112

mergify bot added a commit that referenced this pull request May 22, 2024
@mergify mergify bot merged commit 52e3112 into sigp:unstable May 22, 2024
28 checks passed
@jimmygchen jimmygchen deleted the reduce-polling-pending-validators branch May 28, 2024 00:40
Labels
ready-for-review The code is ready for review v5.2.0 Q2 2024 val-client Relates to the validator client binary
4 participants