RLPx Node Discovery Protocol
Node: an entity on the network
Node ID: 512 bit public key of node
The Node Discovery protocol provides a way to find RLPx nodes that can be connected to. It uses a Kademlia-like protocol to maintain a distributed database of the IDs and endpoints of all listening nodes.
Each node keeps a node table as described in the Kademlia paper [Maymounkov, Mazières 2002]. The node table is configured with a bucket size of 16 (denoted `k` in Kademlia) and a concurrency of 3 (denoted `α` in Kademlia). The idle bucket-refresh interval is 3600 seconds.
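The node table parameters can be sketched as follows. This is a minimal illustration, assuming the plain Kademlia XOR metric applied directly to the 512-bit Node IDs; the names `distance`, `bucket_index`, and `NodeTable` are hypothetical, and eviction of stale entries and bucket refresh are left out.

```python
BUCKET_SIZE = 16         # k in Kademlia
CONCURRENCY = 3          # alpha in Kademlia
REFRESH_INTERVAL = 3600  # idle bucket-refresh interval, in seconds
ID_BITS = 512            # Node IDs are 512-bit public keys


def distance(a: bytes, b: bytes) -> int:
    """XOR distance between two IDs (assumed metric, as in Kademlia)."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def bucket_index(own_id: bytes, other_id: bytes) -> int:
    """Bucket for other_id: position of the highest bit in which the IDs differ."""
    d = distance(own_id, other_id)
    if d == 0:
        raise ValueError("a node does not store itself")
    return d.bit_length() - 1


class NodeTable:
    def __init__(self, own_id: bytes) -> None:
        self.own_id = own_id
        self.buckets = [[] for _ in range(ID_BITS)]

    def add(self, node_id: bytes) -> bool:
        """Insert node_id; False if it is already known or its bucket is full."""
        bucket = self.buckets[bucket_index(self.own_id, node_id)]
        if node_id in bucket or len(bucket) >= BUCKET_SIZE:
            return False
        bucket.append(node_id)
        return True
```

Each bucket covers one distance range, so nodes far from the local ID compete for the same 16 slots while close nodes rarely fill their bucket.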
In order to maintain a well-formed network, RLPx nodes should try to connect to an unspecified number of close nodes. To increase resilience against Sybil attacks, nodes should also connect to randomly chosen, non-close nodes.
Each node runs the UDP-based RPC protocol defined below. The `FIND_VALUE` and `STORE` requests from the Kademlia paper are not part of the protocol, since the Node Discovery Protocol does not provide DHT functionality.
When joining the network, a node fills its node table by performing a recursive Find Node operation with its own ID as the `Target`. The initial Find Node request is sent to one or more bootstrap nodes.
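A recursive Find Node operation can be sketched as an iterative loop. This is only an illustration under stated assumptions: `find_node` is a hypothetical callback that sends one Find Node request and returns the node IDs from the Neighbors reply (an empty list on timeout), distance is plain XOR over the IDs, and filtering out the node's own ID is omitted.

```python
from typing import Callable, Iterable, List, Set

K = 16      # bucket size
ALPHA = 3   # lookup concurrency


def xor_distance(a: bytes, b: bytes) -> int:
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def lookup(target: bytes,
           bootstrap: Iterable[bytes],
           find_node: Callable[[bytes, bytes], List[bytes]]) -> List[bytes]:
    """Query the closest known nodes until the K closest have all been asked."""
    known: Set[bytes] = set(bootstrap)
    asked: Set[bytes] = set()
    while True:
        closest = sorted(known, key=lambda n: xor_distance(n, target))[:K]
        pending = [n for n in closest if n not in asked][:ALPHA]
        if not pending:
            return closest
        # A real implementation would send these ALPHA requests concurrently.
        for node in pending:
            asked.add(node)
            known.update(find_node(node, target))
```

A joining node would call `lookup(own_id, bootstrap_ids, find_node)` and add every returned ID to its node table.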
RLPx nodes that want to accept incoming connections should listen on the same port number for UDP packets (Node Discovery Protocol) and TCP connections (RLPx protocol).
All requests time out after 300 ms. Requests are not re-sent.
UDP packets are structured as follows:
Offset | Field | Description |
---|---|---|
0 | MDC | Ensures integrity of packet, `SHA3(signature \|\| type \|\| data)` |
32 | signature | Ensures authenticity of sender, `SIGN(sender-privkey, SHA3(type \|\| data))` |
97 | type | Single byte in range [1, 4] that determines the structure of Packet Data |
98 | data | RLP encoded, see section Packet Data |
Packets are signed and authenticated. The sender's Node ID is determined by recovering the public key from the signature:
`sender-pubkey = ECRECOVER(signature)`
The integrity of the packet can be verified by computing the expected MDC of the packet as:
`MDC = SHA3(signature || type || data)`
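Splitting a datagram by the offsets given above and checking its MDC could look like the sketch below. The names `Packet`, `parse_packet`, and `stand_in_hash` are hypothetical. The hash is passed in as a parameter because SHA3 here is assumed to mean Keccak-256, as elsewhere in Ethereum, which the Python standard library does not provide; `hashlib.sha3_256` is used purely as a stand-in and does not produce wire-compatible digests. Signature recovery needs a secp256k1 library and is not shown.

```python
import hashlib
from typing import Callable, NamedTuple

MDC_SIZE = 32        # offsets 0..31
SIGNATURE_SIZE = 65  # offsets 32..96
TYPE_OFFSET = 97
DATA_OFFSET = 98


class Packet(NamedTuple):
    mdc: bytes
    signature: bytes
    packet_type: int
    data: bytes  # still RLP encoded


def stand_in_hash(payload: bytes) -> bytes:
    # Assumption: placeholder only. NIST SHA3-256 is not byte-compatible
    # with the Keccak-256 variant used by Ethereum.
    return hashlib.sha3_256(payload).digest()


def parse_packet(raw: bytes,
                 hash_fn: Callable[[bytes], bytes] = stand_in_hash) -> Packet:
    """Split a datagram into its fields and verify the MDC."""
    if len(raw) < DATA_OFFSET:
        raise ValueError("packet too short")
    mdc = raw[:MDC_SIZE]
    signature = raw[MDC_SIZE:MDC_SIZE + SIGNATURE_SIZE]
    packet_type = raw[TYPE_OFFSET]
    data = raw[DATA_OFFSET:]
    if not 1 <= packet_type <= 4:
        raise ValueError("unknown packet type")
    # MDC covers signature || type || data, i.e. everything after the MDC.
    if hash_fn(raw[MDC_SIZE:]) != mdc:
        raise ValueError("MDC mismatch")
    return Packet(mdc, signature, packet_type, data)
```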
All packets contain an `Expiration` date to guard against replay attacks. The date should be interpreted as a UNIX timestamp. The receiver should discard any packet whose `Expiration` value is in the past.
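The expiration rule amounts to a single comparison against the current time. In the sketch below, `is_expired` and `make_expiration` are hypothetical helpers, and the 60-second lifetime is an assumed example value that the protocol text does not specify.

```python
import time
from typing import Optional


def is_expired(expiration: int, now: Optional[float] = None) -> bool:
    """True if the UNIX timestamp `expiration` lies in the past."""
    if now is None:
        now = time.time()
    return expiration < now


def make_expiration(ttl_seconds: int = 60, now: Optional[float] = None) -> int:
    """Expiration for an outgoing packet; the default TTL is an assumption."""
    if now is None:
        now = time.time()
    return int(now) + ttl_seconds
```

A receiver would call `is_expired` on the decoded `Expiration` field before acting on any packet and drop the packet if it returns `True`.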
Ping packets can be sent and received at any time. The receiver should reply with a Pong packet and update the IP/Port of the sender in its node table.
RLP encoding: `[ IP, Port, Expiration ]`
Element | Description |
---|---|
IP | IP address (ASCII string) on which the node is listening |
Port | Listening port of the node |
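To make the encoding concrete, the sketch below implements a minimal RLP encoder for byte strings, non-negative integers, and lists, and uses it to build the Packet Data of a Ping. The function names are hypothetical, and the address, port, and timestamp in the usage note are example values only.

```python
from typing import List, Union

Item = Union[bytes, int, List["Item"]]


def _length_prefix(length: int, offset: int) -> bytes:
    """RLP length header: short form up to 55 bytes, long form beyond."""
    if length <= 55:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item: Item) -> bytes:
    if isinstance(item, int):
        # Integers are big-endian without leading zeros; zero is the empty string.
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single low byte encodes as itself
        return _length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp_encode(x) for x in item)
    return _length_prefix(len(payload), 0xC0) + payload


def encode_ping(ip: str, port: int, expiration: int) -> bytes:
    """Packet Data for Ping: [IP, Port, Expiration], with IP as an ASCII string."""
    return rlp_encode([ip.encode("ascii"), port, expiration])
```

For example, `encode_ping("127.0.0.1", 30303, 1700000000)` yields the bytes that would follow the type byte `0x01` in the packet.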
Pong is the reply to a Ping packet.
RLP encoding: `[ Reply Token, Expiration ]`
Element | Description |
---|---|
Reply Token | Content of the MDC element of the Ping packet |
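Because the Reply Token echoes the MDC of the Ping, a sender can match Pongs to outstanding Pings by keeping the MDCs it has sent. The sketch below is one possible way to do this, combined with the 300 ms request timeout; the `PendingPings` class and its method names are hypothetical.

```python
import time
from typing import Dict, Optional

REQUEST_TIMEOUT = 0.3  # seconds; requests time out after 300 ms


class PendingPings:
    """Tracks outstanding Pings by MDC so that Pongs can be matched."""

    def __init__(self) -> None:
        self._sent: Dict[bytes, float] = {}  # Ping MDC -> send time

    def sent(self, ping_mdc: bytes, now: Optional[float] = None) -> None:
        self._sent[ping_mdc] = time.monotonic() if now is None else now

    def on_pong(self, reply_token: bytes, now: Optional[float] = None) -> bool:
        """True if the Pong answers a Ping we sent and arrived in time."""
        now = time.monotonic() if now is None else now
        sent_at = self._sent.pop(reply_token, None)
        return sent_at is not None and now - sent_at <= REQUEST_TIMEOUT
```

Removing the entry on the first match means a replayed Pong with the same token is rejected.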
Find Node packets are sent to locate nodes close to a given target ID. The receiver should reply with a Neighbors packet containing the `k` nodes closest to the target that it knows about.
RLP encoding: `[ Target, Expiration ]`
Element | Description |
---|---|
Target | The target ID |
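On the receiving side, answering a Find Node request reduces to selecting the closest known IDs. The following sketch assumes XOR distance over the raw IDs; `closest_nodes` is a hypothetical helper, not a name from the protocol.

```python
import heapq
from typing import Iterable, List

K = 16  # bucket size, also the maximum number of nodes returned


def closest_nodes(known_ids: Iterable[bytes], target: bytes, k: int = K) -> List[bytes]:
    """Return up to k known IDs, ordered by XOR distance to target."""
    target_int = int.from_bytes(target, "big")

    def dist(node_id: bytes) -> int:
        return int.from_bytes(node_id, "big") ^ target_int

    return heapq.nsmallest(k, known_ids, key=dist)
```

If the node knows fewer than `k` nodes, all of them are returned.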
Neighbors is the reply to Find Node. It contains up to `k` nodes known to the sender that are closest to the requested `Target`.
RLP encoding: `[ [ Node₁, Node₂, ..., Nodeₙ ], Expiration ]`
Each `Node` is a list of the form `[ IP, Port, ID ]`
Element | Description |
---|---|
IP | IP address (ASCII string) on which the node is listening |
Port | Listening port of the node |
ID | The advertised node's public key |
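Once the Packet Data of a Neighbors packet has been RLP decoded, its entries can be turned into typed records. The sketch below assumes the decoder yields nested lists of byte strings, with integers as big-endian bytes; `NodeRecord` and `parse_neighbors` are hypothetical names, and rejecting replies with more than `k` entries is an assumed policy rather than a stated rule.

```python
from typing import List, NamedTuple, Tuple

K = 16            # Neighbors carries up to k nodes
NODE_ID_SIZE = 64  # 512-bit public key


class NodeRecord(NamedTuple):
    ip: str
    port: int
    node_id: bytes


def parse_neighbors(decoded: list) -> Tuple[List[NodeRecord], int]:
    """Convert decoded [[Node, ...], Expiration] into records and a timestamp."""
    nodes_raw, expiration_raw = decoded
    if len(nodes_raw) > K:
        raise ValueError("too many nodes in Neighbors packet")
    nodes = []
    for ip, port, node_id in nodes_raw:
        if len(node_id) != NODE_ID_SIZE:
            raise ValueError("Node ID must be 512 bits")
        nodes.append(NodeRecord(ip.decode("ascii"),
                                int.from_bytes(port, "big"),
                                node_id))
    return nodes, int.from_bytes(expiration_raw, "big")
```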