Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Peering - disconnect peers that have multiple discovery ports #7089

Merged
merged 11 commits into from May 12, 2024

Conversation

pinges
Copy link
Contributor

@pinges pinges commented May 10, 2024

PR description

In discovery we seemed to see a lot of "misconfigured" nodes that share the same IP address and TCP port, but use different discovery ports. This PR checks for these nodes and removes them from the peer table.

Fixes #6805

pinges and others added 8 commits May 8, 2024 13:58
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
@macfarla
Copy link
Contributor

very cool find @pinges - are there more tests you wanted to add?

@macfarla
Copy link
Contributor

11 holesky nodes - 10 of them get to 100% peers within 2h
Screenshot 2024-05-10 at 2 43 23 PM

the 11th one eventually gets to 100% peers (although it does take ~7h)
Screenshot 2024-05-10 at 2 44 21 PM

Current behaviour on holesky is > 90% of nodes get stuck at 0 peers indefinitely (require setting static peers to sync)

See #6805

@macfarla
Copy link
Contributor

16 mainnet nodes - all of them get to 100% peers within 1h.
Screenshot 2024-05-10 at 2 48 38 PM

Current behaviour on mainnet is a small percentage - maybe 5% - of nodes get stuck on 0 peers and require a restart.
example - restarted at 23:00 which is when it starts to get peer connections.
Screenshot 2024-05-10 at 2 51 09 PM

@macfarla macfarla changed the title Investigate peering issues Peering - disconnect peers that have multiple discovery ports May 10, 2024
Copy link
Contributor

@macfarla macfarla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs a changelog entry

Copy link
Contributor

@macfarla macfarla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just needs a changelog entry

private final LHMWithMaxSize<String, Integer> ipAddressCheckMap =
new LHMWithMaxSize<>(DEFAULT_BUCKET_SIZE * N_BUCKETS);
private final CircularFifoQueue<String> invalidIPs =
new CircularFifoQueue<>(DEFAULT_BUCKET_SIZE * N_BUCKETS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious - why these specific collection types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CircularFifo is a simple queue with a maximum size.
LHMWithMaxSize is a private class, it is a LinkedHashMap that I have extended to have a maximum size.
Lukas complained about the name of the class and I will rename it :-)

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
@@ -51,6 +54,10 @@ public class PeerTable {
private final Map<Bytes, Integer> distanceCache;
private BloomFilter<Bytes> idBloom;
private int evictionCnt = 0;
private final LHMWithMaxSize<String, Integer> ipAddressCheckMap =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should rename LHMWithMaxSize as LinkedHashMapWithMaxSize to be clearer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do :-)

Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net>
@pinges pinges merged commit 5fa1750 into hyperledger:main May 12, 2024
42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Low Peer Numbers
3 participants