Round quorum before advancement #2897

vicsn · 2023-12-05T18:55:08Z

Motivation

Open questions

We can consider merging AddressWithCoordinate<N> upstream into snarkVM.

Performance

In the tradition of Lukasz's brilliance, I'm using Vecs+BinarySearch instead of maps.

In this repo you can find some benchmarks of different constructions. If we don't check quorum, we can do 250 updates per millisecond. If we do check quorum each time because we receive increased round numbers, we can do 9 updates per millisecond. I expect that on average an update will cost closer to 1/250 of a millisecond. Moreover, an update will only be done if a peer actually sends a very high round number.
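To illustrate the Vecs+BinarySearch pattern mentioned above, here is a minimal sketch. It is not the code from this PR: `SortedRounds`, its methods, and the use of `u32` in place of `AddressWithCoordinate<N>` are hypothetical stand-ins chosen to keep the example self-contained.

```rust
/// Hypothetical stand-in for a round cache: a Vec of (round, validators)
/// kept sorted by round and searched with binary search instead of a map.
#[derive(Default)]
struct SortedRounds {
    // Sorted ascending by round; each round appears at most once.
    // Each inner Vec of validators is also kept sorted.
    rounds: Vec<(u64, Vec<u32>)>,
}

impl SortedRounds {
    /// Records that `validator` was seen at `round`.
    /// Returns `true` if this (round, validator) pair was not present before.
    fn insert(&mut self, round: u64, validator: u32) -> bool {
        match self.rounds.binary_search_by_key(&round, |(r, _)| *r) {
            // The round already exists: insert the validator in sorted position.
            Ok(i) => {
                let validators = &mut self.rounds[i].1;
                match validators.binary_search(&validator) {
                    Ok(_) => false,
                    Err(j) => {
                        validators.insert(j, validator);
                        true
                    }
                }
            }
            // The round is new: insert it where binary search says it belongs.
            Err(i) => {
                self.rounds.insert(i, (round, vec![validator]));
                true
            }
        }
    }

    /// Returns the validators recorded at `round`, if any.
    fn validators_at(&self, round: u64) -> Option<&[u32]> {
        self.rounds
            .binary_search_by_key(&round, |(r, _)| *r)
            .ok()
            .map(|i| self.rounds[i].1.as_slice())
    }
}
```

Lookups are O(log n) and inserts are O(n) because elements after the insertion point are shifted, which stays cheap as long as the collection is small.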

Also note that computing quorum currently requires us to collect into a HashSet. We could add a version which just uses an iterator (checking along the way that all elements are unique).
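A rough sketch of what such an iterator-based check could look like. Everything here is an assumption for illustration: `StakeTable` is a hypothetical stand-in for the committee, addresses are plain `u32`s, the threshold formula (more than two thirds of total stake) is assumed, and uniqueness is checked by requiring the input to be strictly increasing, which only works if the caller iterates over an already sorted collection.

```rust
/// Hypothetical committee stand-in: (address, stake) pairs sorted by address.
struct StakeTable {
    members: Vec<(u32, u64)>,
}

impl StakeTable {
    fn total_stake(&self) -> u64 {
        self.members.iter().map(|(_, stake)| *stake).sum()
    }

    /// Assumed threshold: strictly more than two thirds of the total stake.
    fn quorum_threshold(&self) -> u64 {
        self.total_stake().saturating_mul(2) / 3 + 1
    }

    /// Looks up the stake of `address` by binary search; `None` for non-members.
    fn stake_of(&self, address: u32) -> Option<u64> {
        self.members
            .binary_search_by_key(&address, |(a, _)| *a)
            .ok()
            .map(|i| self.members[i].1)
    }

    /// Checks quorum over an iterator without collecting into a set.
    /// Returns `None` if the addresses are not strictly increasing
    /// (that is, if the input is unsorted or contains duplicates).
    fn is_quorum_reached_sorted(&self, addresses: impl IntoIterator<Item = u32>) -> Option<bool> {
        let mut previous: Option<u32> = None;
        let mut stake = 0u64;
        for address in addresses {
            // Uniqueness check along the way: each address must exceed the last.
            if matches!(previous, Some(p) if p >= address) {
                return None;
            }
            previous = Some(address);
            // Addresses outside the committee contribute no stake.
            stake = stake.saturating_add(self.stake_of(address).unwrap_or(0));
        }
        Some(stake >= self.quorum_threshold())
    }
}
```

With four members of equal stake 10, the assumed threshold is 27, so any three distinct members reach quorum and two do not.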

Persistence

I'm not storing the cache in the database, since we should expect to receive a quorum's worth of updates from other nodes every round.

node/bft/src/primary.rs

node/bft/src/helpers/round_cache.rs

niklaslong · 2023-12-06T16:14:40Z

Nice benches, I'm almost surprised BTreeMap wasn't the fastest with multiple inserts.

node/bft/src/helpers/mod.rs

node/bft/tests/narwhal_e2e.rs

niklaslong · 2023-12-06T17:07:51Z

Do we usually run all ignore = tests manually? If so, how?

Most ignored e2e tests are long-running (e.g. manual testing) or are too costly to perform on CI. We run them locally from time to time as a sanity check, but ideally our coverage should be complete with regular tests.

node/bft/src/helpers/round_cache.rs

node/bft/src/helpers/cache_round.rs

raychu86 · 2024-02-12T02:52:16Z

node/bft/src/primary.rs

+        // If our peer is far ahead, check if a quorum of peers is ahead and consider updating our committee.
+        } else if is_peer_far_in_future {
+            // Get the highest round seen from a quorum of the current committee
+            let committee = self.ledger.get_committee_for_round(self.current_round())?;


This should be get_previous_committee_for_round.

In the mainnet branch this should be changed to get_committee_lookback_for_round.

e53bdb0

Let me know if you need me to create a new PR targeting mainnet.

node/bft/src/helpers/cache_round.rs

raychu86 · 2024-02-12T03:07:12Z

node/bft/src/helpers/cache_round.rs

+
+        // Check if we reached quorum on a new round
+        if inserted {
+            while committee.is_quorum_threshold_reached(&self.validators_in_support(committee)?) {


The committee being used here may change based on the round you are checking.

I think it's fine to use the single given committee (from get_previous_committee_for_round) to update completely, because who cares whether old outdated committees have high rounds.

Sidenote: if we were to only update by 1 round at a time based on the given committee, instead of always taking just <quorum> batch proposals to update to the correct round, we would need <quorum> + i batch proposals to increase our round by i. If <quorum> > max_gc_rounds, then we might never catch up...
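One way to picture "update completely" from a single committee is the sketch below. It is an illustration under assumptions, not the logic in cache_round.rs: `highest_round_with_quorum` is a hypothetical helper, the input is each validator's latest seen round paired with its stake, and the threshold is passed in by the caller.

```rust
/// Hypothetical helper: given each validator's latest seen round and stake,
/// returns the highest round `r` such that the validators whose latest round
/// is at least `r` together hold at least `threshold` stake.
/// Returns `None` if the threshold cannot be reached at any round.
fn highest_round_with_quorum(latest: &[(u64, u64)], threshold: u64) -> Option<u64> {
    // Sort by round, highest first, so stake accumulates from the most
    // advanced validators downwards.
    let mut sorted: Vec<(u64, u64)> = latest.to_vec();
    sorted.sort_unstable_by(|a, b| b.0.cmp(&a.0));

    let mut stake = 0u64;
    for (round, validator_stake) in sorted {
        stake = stake.saturating_add(validator_stake);
        // The first round at which the accumulated stake reaches the
        // threshold is the highest round a quorum has reached or passed.
        if stake >= threshold {
            return Some(round);
        }
    }
    None
}
```

With four validators of stake 10 and a threshold of 27, latest rounds of 100, 99, 98 and 5 yield round 98 in a single step, while one validator claiming round 1000 and three at round 5 yields round 5, so a lone peer far in the future cannot move the result.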

Is there any case where you won't ever be able to see quorum because the fixed committee you are using doesn't include newly bonded validators?

Great question. It seems so, yes, because we use our old outdated round to determine the committee: self.ledger.get_previous_committee_for_round(self.current_round()).

The only alternative I see now is to use the peer's advertised batch_round. One question on that: is it possible that we don't know the committee yet for rounds far in the future, so that get_previous_committee_for_round(batch_round)? will keep failing? Or do we expect that to succeed at some point?

raychu86 · 2024-02-12T03:09:24Z

node/bft/src/helpers/cache_round.rs

+    }
+
+    /// Insert a validator at a round
+    fn insert_validator_at_round(&mut self, round: u64, validator: AddressWithCoordinate<N>) {


Why doesn't this add the validator to address_rounds?

The function is badly named: we are only updating because we saw a new round for this validator. See ef446e9.

node/bft/src/helpers/cache_round.rs

node/bft/src/primary.rs

ljedrz

I can't test it right now, but LGTM code-wise 👍

Co-authored-by: ljedrz <ljedrz@users.noreply.github.com> Signed-off-by: vicsn <victor.s.nicolaas@protonmail.com>

howardwu · 2024-03-09T00:37:59Z

@vicsn in light of major code drift from testnet3, can you rebase this PR onto mainnet-staging and test this PR on a network to confirm its validity/necessity?

vicsn · 2024-03-11T10:11:17Z

Yes, will coordinate and help Michel to first get this related issue over the finish line to see if it resolves the attack: #3119

joske · 2024-03-15T14:26:25Z

node/bft/src/helpers/cache_round.rs

+    /// The current highest round which has (stake-weighted) quorum
+    last_highest_round_with_quorum: u64,
+    /// A sorted list of (round, Vec<AddressWithCoordinate<N>>), indicating the last seen highest round for each address
+    highest_rounds: Vec<(u64, Vec<AddressWithCoordinate<N>>)>,


Why does this Vec have to be sorted? It seems only to quickly fetch an item? If that is true, why not use HashMap? This gives you O(1) lookups and O(1) inserts. Now you have O(n) inserts and O(log n) lookups.

Will come back to this if we decide to keep this PR

It's faster and more lightweight, unless the collection can become arbitrarily large.

vicsn · 2024-03-29T09:37:48Z

Deprecating this in favour of #3119, which has been tested and works.

vicsn requested a review from niklaslong December 5, 2023 18:55

niklaslong reviewed Dec 6, 2023

node/bft/src/primary.rs (outdated)

niklaslong reviewed Dec 6, 2023

node/bft/src/helpers/round_cache.rs (outdated)

niklaslong reviewed Dec 6, 2023

node/bft/src/helpers/round_cache.rs (outdated)

vicsn mentioned this pull request Dec 6, 2023

Derive Ord + PartialOrd to compare Address AleoNet/snarkVM#2231

Closed

niklaslong reviewed Dec 6, 2023

node/bft/src/helpers/mod.rs (outdated)

niklaslong reviewed Dec 6, 2023

node/bft/tests/narwhal_e2e.rs (outdated)

niklaslong reviewed Dec 6, 2023

node/bft/src/helpers/round_cache.rs (outdated)

vicsn force-pushed the round_quorum_before_advancement branch from a96627d to 668dc7b Compare December 6, 2023 20:38

vicsn requested a review from ljedrz December 6, 2023 20:38

vicsn added 3 commits December 7, 2023 11:37

Add RoundCache: only update round if quorum has a higher round

35a7416

Rename Cache to PeerCache

e65b1d6

tighten test duration

3ff7e91

vicsn force-pushed the round_quorum_before_advancement branch from 668dc7b to 3ff7e91 Compare December 7, 2023 10:50

Only check is_quorum_far_in_future when is_peer_far_in_future

f412ffe

vicsn marked this pull request as ready for review December 7, 2023 12:39

ljedrz reviewed Dec 7, 2023

node/bft/src/helpers/cache_round.rs

ljedrz reviewed Dec 7, 2023

node/bft/src/helpers/cache_round.rs (outdated)

vicsn and others added 4 commits December 7, 2023 17:07

Do not preallocate unknown amount

8049d6e

Use a plain ol' Vec

ad551aa

Merge branch 'testnet3' into round_quorum_before_advancement

3f9083f

Merge branch 'testnet3' into round_quorum_before_advancement

aee7fe3

raychu86 reviewed Feb 12, 2024

node/bft/src/helpers/cache_round.rs

raychu86 reviewed Feb 12, 2024

raychu86 and others added 2 commits February 11, 2024 19:21

Add comments

1591196

Ensure we update the round cache when seeing a round with quorum

c8fff29

vicsn added 2 commits February 12, 2024 12:54

Use get_previous_committee_for_round

e53bdb0

Improve naming

ef446e9

howardwu requested review from ljedrz, niklaslong and joske February 14, 2024 01:53

ljedrz reviewed Feb 16, 2024

node/bft/src/helpers/cache_round.rs (outdated)

ljedrz reviewed Feb 16, 2024

node/bft/src/helpers/cache_round.rs (outdated)

ljedrz reviewed Feb 16, 2024

node/bft/src/primary.rs (outdated)

ljedrz approved these changes Feb 16, 2024

vicsn and others added 3 commits February 16, 2024 13:58

Update node/bft/src/helpers/cache_round.rs

10827d0

Co-authored-by: ljedrz <ljedrz@users.noreply.github.com> Signed-off-by: vicsn <victor.s.nicolaas@protonmail.com>

Update node/bft/src/primary.rs

ab7094e

Co-authored-by: ljedrz <ljedrz@users.noreply.github.com> Signed-off-by: vicsn <victor.s.nicolaas@protonmail.com>

Update node/bft/src/helpers/cache_round.rs

5c6ff28

Co-authored-by: ljedrz <ljedrz@users.noreply.github.com> Signed-off-by: vicsn <victor.s.nicolaas@protonmail.com>

raychu86 mentioned this pull request Feb 23, 2024

[Fix] Updates conditions to advance from odd and even rounds #3119

Merged

howardwu marked this pull request as draft March 9, 2024 00:38

joske reviewed Mar 15, 2024

vicsn closed this Mar 29, 2024

vicsn deleted the round_quorum_before_advancement branch March 29, 2024 09:37
