Sync: Rapidly find and track peer canonical heads #811

jlokier · 2021-08-24T14:07:25Z

First component of new sync approach.

This module fetches and tracks the canonical chain head of each connected peer. (Or in future, each peer we care about; we won't poll them all so often.)

This is for when we aren't sure of the block number of a peer's canonical chain head. Most of the time, once it has found that block, it quietly polls to track small updates to the "best" block number and hash of each peer.

But sometimes tracking gets out of step: there has been a reorg deeper than our tracking window, a burst of more than a few new blocks, network delays or downtime, or the peer is itself syncing. Perhaps we stopped Nimbus and restarted a while later, e.g. by suspending a laptop or pressing Control-Z. In those cases this module catches up. It is even possible that the best hash the peer gave us in the `Status` handshake has disappeared by the time we query for the corresponding block number, in which case we start at zero.

The steps here perform a robust and efficient O(log N) search to rapidly converge on the new best block if it has moved out of the polling window, no matter where the search starts; confirm the peer's canonical chain head boundary; then track the peer's chain head in real time by polling. The method is robust to peer state changes at any time.
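To make the O(log N) claim concrete, here is a minimal sketch of an exponential hunt followed by bisection. It is not the code in this PR: `findHead`, `HasBlock` and `peerHas` are hypothetical names, and the sketch assumes a stable oracle that answers "does the peer have canonical block number n?" and that genesis (block 0) always exists. The real tracker also has to cope with the peer's chain changing between queries, which this sketch ignores.

```nim
# Hypothetical sketch only; names and the oracle are assumptions.
type HasBlock = proc(n: uint64): bool

proc findHead(hasBlock: HasBlock, start: uint64): uint64 =
  ## Highest block number for which `hasBlock` is true, assuming the
  ## answer is true for every n <= head and false for every n > head.
  var lo, hi: uint64
  var step = 1'u64
  if hasBlock(start):
    # Hunt forward, doubling the step until we overshoot the head.
    lo = start
    hi = start + step
    while hasBlock(hi):
      lo = hi
      step = step * 2'u64
      hi = lo + step
  else:
    # Hunt backward, doubling the step until we land on a known block.
    hi = start
    lo = if start > step: start - step else: 0'u64
    while lo > 0'u64 and not hasBlock(lo):
      hi = lo
      step = step * 2'u64
      lo = if lo > step: lo - step else: 0'u64
  # Now hasBlock(lo) is true and hasBlock(hi) is false: bisect the gap.
  while hi - lo > 1'u64:
    let mid = lo + (hi - lo) div 2'u64
    if hasBlock(mid):
      lo = mid
    else:
      hi = mid
  lo

# Stand-in for a peer whose canonical head is block 1000.
let head = 1000'u64
proc peerHas(n: uint64): bool = n <= head

doAssert findHead(peerHas, 0'u64) == 1000'u64     # started far behind
doAssert findHead(peerHas, 5000'u64) == 1000'u64  # started far ahead
```

Both the hunt and the bisection take a number of queries logarithmic in the distance between the starting guess and the head, which is why a bad starting point (even zero) is cheap to recover from.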

The purpose is to:

- Help with finding a peer common chain prefix ("fast sync pivot") in a consistent, fast and explicit way.
- Catch up quickly after long pauses caused by network downtime, the program not running, or deep chain reorgs.
- Be able to display real-time peer states, so they are less mysterious.
- Tell the beam/snap/trie sync processes when to start and what blocks to fetch, and keep those fetchers in the head-adjacent window of the ever-changing chain.
- Help the sync process bootstrap usefully when we only have one peer, speculatively fetching and validating what data we can before we have more peers to corroborate the consensus.
- Help detect consensus failures in the network.

We cannot assume a peer's canonical chain stays the same or only gains new blocks from one query to the next. There can be reorgs, including deep reorgs. When a reorg happens, the best block number can decrease if the new canonical chain is shorter than the old one, and the best block hash we previously knew can become unavailable on the peer. So we must detect when the current best block disappears and be able to reduce the block number we track.
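As a rough illustration of that check (not the PR's code), the sketch below classifies a peer's answer when we ask again for the block we last believed was its head. `BlockId`, `HeadCheck` and `checkHead` are hypothetical names, and hashes are plain strings purely for brevity.

```nim
import std/options

type
  # Hypothetical types for illustration only.
  BlockId = tuple[number: uint64, hash: string]
  HeadCheck = enum
    StillCanonical   # peer still has the same block at that number
    Replaced         # same number, different hash: a reorg happened
    Disappeared      # nothing at that number: the chain got shorter

proc checkHead(known: BlockId, reply: Option[BlockId]): HeadCheck =
  ## `reply` is what the peer returned for block number `known.number`.
  if reply.isNone:
    Disappeared
  elif reply.get.hash != known.hash:
    Replaced
  else:
    StillCanonical

let known: BlockId = (number: 100'u64, hash: "0xaa")
doAssert checkHead(known, some((number: 100'u64, hash: "0xaa"))) == StillCanonical
doAssert checkHead(known, some((number: 100'u64, hash: "0xbb"))) == Replaced
doAssert checkHead(known, none(BlockId)) == Disappeared
```

In the `Replaced` and `Disappeared` cases the tracker would have to search backward for a block the peer still considers canonical, rather than only polling forward.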

Also:

Add `--newsync` option and use it. This option enables new blockchain sync and real-time consensus algorithms that will eventually replace the old, very limited sync.

New sync is a work in progress. It's included as an option rather than a code branch because it's more useful for testing this way, and it must not conflict anyway. It's off by default. Eventually it will be enabled by default and the option will be removed.

Signed-off-by: Jamie Lokier <jamie@shareable.org>

jangko · 2021-08-25T00:59:27Z

nimbus/sync/sync_types.nim

+  stint, stew/byteutils,
+  eth/[common/eth_types, p2p]
+
+const


We could probably use `{.booldefine.}` for these `tracexxx` constants and then set them with `-d:tracexxx=true/false`. That would reduce accidental commits compared to editing the values of these constants.

But we can add that later, when the next component arrives. For now we can merge it.

And instead of using `if` where the constants are used, we can use `when`.
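A minimal sketch of what that could look like; `traceExample` and `handle` are made-up names, not the constants in this PR:

```nim
# Defaults to false; override at build time with -d:traceExample=true
const traceExample {.booldefine.} = false

proc handle(n: int): int =
  # `when` is resolved at compile time, so with the default setting the
  # trace line is not compiled into the binary at all.
  when traceExample:
    echo "trace: handle ", n
  n * 2

doAssert handle(21) == 42
```

With `if`, the branch would still be type-checked and compiled; with `when`, a disabled trace costs nothing.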

KonradStaniec · 2021-08-25T11:34:10Z

> This module fetches and tracks the canonical chain head of each connected peer. (Or in future, each peer we care about; we won't poll them all so often.)
>
> This is for when we aren't sure of the block number of a peer's canonical chain head. Most of the time, after finding which block, it quietly polls to track small updates to the "best" block number and hash of each peer.

Just curious, is this polling of each peer really necessary? Usually, after successful proof-of-work validation, each peer propagates a `NewBlock` message to the square root of its peers and, after each import, a `NewBlockHashes` message to all its peers. Wouldn't tracking the peer head based on those messages be enough?

arnetheduck · 2021-10-05T16:18:08Z

nimbus/config.nim

@@ -138,6 +138,7 @@ type
    verifyFromOk*: bool           ## activate `verifyFrom` setting
    verifyFrom*: uint64           ## verification start block, 0 for disable
    engineSigner*: EthAddress     ## Miner account
+    newSync*: bool                ## --newsync experimental option


this seems over the top, just remove the old broken one and be done with it - it doesn't have value

arnetheduck · 2021-10-05T16:18:55Z

nimbus/sync/chain_head_tracker.nim

+    ## Expansion factor during `SyncHuntBackward` exponential search.
+    ## 2 is chosen for better convergence when tracking a chain reorg.
+
+doAssert syncLockedMinimumReply >= 2


Suggested change

-doAssert syncLockedMinimumReply >= 2
+static: doAssert syncLockedMinimumReply >= 2
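For context, a small sketch of the difference; `exampleMinimumReply` and its value are placeholders rather than the real constant:

```nim
const exampleMinimumReply = 8   # hypothetical value

# Evaluated by the compiler; a false condition stops the build.
static: doAssert exampleMinimumReply >= 2

# Evaluated when the module initialises at run time.
doAssert exampleMinimumReply >= 2
```

Since the operand is a `const`, the compile-time form catches a bad value before any binary is produced.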

arnetheduck · 2024-05-15T19:12:12Z

No longer relevant

jlokier added 2 commits August 24, 2021 16:56

jlokier force-pushed the jl/head-sync branch from f744c78 to 9ae8cad Compare August 24, 2021 15:56

jangko reviewed Aug 25, 2021

arnetheduck reviewed Oct 5, 2021

arnetheduck closed this May 15, 2024
