Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

confirm pivot block - removed disconnect based on chain height estimate #6889

Draft
wants to merge 23 commits into
base: main
Choose a base branch
from

Conversation

macfarla
Copy link
Contributor

@macfarla macfarla commented Apr 5, 2024

PR description

When choosing the "best" peer to try to confirm the pivot block

  • use the default comparator (TTD then chain height) effectively the same
  • remove the pre-requirement for the chain height estimate
  • don't disconnect the "worst" useless peer at this point
  • and update the peer's chain height estimate if the pivot block is confirmed
  • also allow 5 peers below the pivot block height since the chain height estimate we use for comparison can be wildly inaccurate

Fixed Issue(s)

Fixes #6887

Thanks for sending a pull request! Have you done the following?

  • Checked out our contribution guidelines?
  • Considered documentation and added the doc-change-required label to this PR if updates are required.
  • Considered the changelog and included an update if required.
  • For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests

Locally, you can run these tests to catch failures early:

  • unit tests: ./gradlew build
  • acceptance tests: ./gradlew acceptanceTest
  • integration tests: ./gradlew integrationTest
  • reference tests: ./gradlew ethereum:referenceTests:referenceTests

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
@macfarla
Copy link
Contributor Author

macfarla commented Apr 7, 2024

3 EthPeers {
PeerId: 0x3bd1127b27212d2d... 1297018 height (est) PeerReputation score: 97, timeouts: {1=4, 3=1}, useless: 0, validated? true, disconnected? false, client: erigon/v2.58.1-f12e451c/linux-amd64/go1.21.7, [Connection with hashCode 820003208 inboundInitiated? false initAt 1712461468564], enode://3bd1127b27212d2d165ffaa7ae1e0efbdfdf7ff2022e851dc95271e44bb271446ab05e54035756da7eb6ce86b595992d94f2400ed19a34d9485df81fc12d0879@147.28.163.12:30304, 
PeerId: 0xac906289e4b7f12d... 1296969 height (est) PeerReputation score: 150, timeouts: {}, useless: 0, validated? true, disconnected? false, client: Geth/v1.13.14-stable-2bd6bd01/linux-amd64/go1.21.7, [Connection with hashCode 1951103145 inboundInitiated? false initAt 1712461466663], enode://ac906289e4b7f12df423d654c5a962b6ebe5b3a74cc9e06292a85221f9a64a6f1cfdd6b714ed6dacef51578f92b34c60ee91e9ede9c7f8fadc4d347326d95e2b@146.190.13.128:30303, 
PeerId: 0xa3435a0155a3e837... 1297002 height (est) PeerReputation score: 150, timeouts: {}, useless: 0, validated? true, disconnected? false, client: Geth/v1.13.14-stable-2bd6bd01/linux-amd64/go1.21.7, [Connection with hashCode 455188871 inboundInitiated? false initAt 1712461466663], enode://a3435a0155a3e837c02f5e7f5662a2f1fbc25b48e4dc232016e1c51b544cb5b4510ef633ea3278c0e970fa8ad8141e2d4d0f9f95456c537ff05fdf9b31c15072@178.128.136.233:30303}

@macfarla
Copy link
Contributor Author

macfarla commented Apr 7, 2024

2024-04-07 13:45:03.763	
trailing peer requirements TrailingPeerRequirements{minimumHeightToBeUpToDate=1296943, maxTrailingPeers=5}

2024-04-07 13:45:03.763	
trailing peer requirements TrailingPeerRequirements{minimumHeightToBeUpToDate=0, maxTrailingPeers=9223372036854775807}

@@ -74,7 +74,8 @@ public SyncTargetManager(
protected CompletableFuture<Optional<EthPeer>> selectBestAvailableSyncTarget() {
final BlockHeader pivotBlockHeader = fastSyncState.getPivotBlockHeader().get();
final EthPeers ethPeers = ethContext.getEthPeers();
final Optional<EthPeer> maybeBestPeer = ethPeers.bestPeerWithHeightEstimate();
// just use the default (merge) comparator
final Optional<EthPeer> maybeBestPeer = ethPeers.bestPeer();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be a step back thinking about PoS networks, no much use for the comparison using TTD since this is basically a constant after chain switches to PoS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it actually evaluates to the same thing in the end so I'm reverting this line change.

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
@macfarla macfarla changed the title confirm pivot block - removed requirement for chain height estimate confirm pivot block - removed disconnect based on chain height estimate Apr 8, 2024
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
@macfarla
Copy link
Contributor Author

macfarla commented Apr 9, 2024

I started 3 holesky nodes last night - 2 of 3 did really well, the third one dropped to zero peers (this seems to happen on holesky, workaround is to add the 2 bootnodes using admin_addPeer so they effectively become static peers). The first 2 still have 25 peers and the third one still has only 2 peers (the bootnodes) <- I have seen this on holesky independent of this PR (maybe worth investigating as a discovery issue separately but I don't think it's high priority)
Screenshot 2024-04-09 at 5 15 09 PM

holesky sync also fine (once the zero peers node got peers, it quickly got back in sync)
Screenshot 2024-04-09 at 5 18 24 PM

@macfarla
Copy link
Contributor Author

macfarla commented Apr 9, 2024

2 x mainnet nodes started today, one took a long time (more than 5h) to find peers - mainnet-sally-6889-07 - currently on 24 peers. Both look to be syncing fine (so far)
Screenshot 2024-04-09 at 5 20 13 PM

sync progress
Screenshot 2024-04-09 at 5 20 22 PM

@macfarla macfarla marked this pull request as draft April 9, 2024 11:46
@@ -163,7 +163,7 @@ public void deleteFastSyncState() {
protected FastSyncState updateMaxTrailingPeers(final FastSyncState state) {
if (state.getPivotBlockNumber().isPresent()) {
trailingPeerRequirements =
Optional.of(new TrailingPeerRequirements(state.getPivotBlockNumber().getAsLong(), 0));
Optional.of(new TrailingPeerRequirements(state.getPivotBlockNumber().getAsLong(), 5));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion is, we should probably eliminate this TrailingPeer validation (since there's no garantee that peers are going to have their chain height estimate up-to-date) but for test purpose we can easily skip the trailing validation by assigning Long.MAX_VALUE to the second param of the TrailingPeerRequirements constructor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

peer chain state height estimate no longer a good measure of how good a peer is
2 participants