-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disconnect on a malformed version message, and reject duplicate verack messages #6781
Disconnect on a malformed version message, and reject duplicate verack messages #6781
Conversation
@AndreaLanfranchi I used " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK 54bdbf5
src/main.cpp
Outdated
pfrom->nVersion = nVersion; | ||
pfrom->nServices = nServices; | ||
|
||
if (!vRecv.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these check the length remaining to ensure sufficient bytes remain for successful parsing? Or is the approach here to rely upon exception handling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The length of the message is allowed to be either the minimum length (just the mandatory fields), or exactly one of the lengths obtained by adding each optional set of fields in turn, or anything longer than the longest such length (to allow for future expansion). That's exactly what the current code allows, and yes, we're relying on exception handling to reject anything else.
Since this potentially affects peer-to-peer protocol compatibility (and involves parsing code and exceptions in C++), I'd like two reviews please :-) I suggest commit-by-commit review, with whitespace changes ignored for the third commit. |
Incidentally, I also checked that Deserialization of Line 242 in 883fa68
|
Hi @daira, |
@AndreaLanfranchi wrote:
In that case it can stay as-is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed as of 54bdbf5.
src/main.cpp
Outdated
LOCK(cs_main); | ||
CNodeState* state = State(pfrom->GetId()); | ||
if (state->fCurrentlyConnected) { | ||
pfrom->PushMessage("reject", strCommand, REJECT_DUPLICATE, std::string("Duplicate verack message")); | ||
Misbehaving(pfrom->GetId(), 1); | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I confirmed that State()
is used elsewhere without being guarded by pfrom->fNetworkNode
(which makes sense as it is intended for decoupling node-relevant network state from the network locks).
Currently the new reject will never trigger when a duplicate verack
message is received from a non-fNetworkNode
, which AFAICT includes all inbound connections (fNetworkNode
is only set to true in OpenNetworkConnection
for outbound connections). This implies that if we want to reject duplicate verack messages from all non-local CNode
s (both outbound and inbound), then we need a new boolean to track that state (because fCurrentlyConnected
is from upstream, and so we shouldn't go changing its semantics inside the node).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the presence, actually, of two separate booleans (fCurrentlyConnected
from State and fSuccessfullyConnected
from CNode member) are a bit confusing and, IMHO, misleading. Wrt specifically to the latter its meaning is ambiguous as it may appear as a redundant statement of what already is in State but at the same time it doesn't mean the handshake process has been fully completed (exchange amongst the two ends of the socket of both version
and verack
).
I believe it would be beneficial to move the valuation of fSuccessfullyConnected
to true only after verack
message has been successfully parsed and processed and at the same time move there as well :
- the
getaddr
request (if applicable) - the advertising of local address (if applicable)
- the tagging as "good" (in addrman) of the sender's address (if applicable): actually a "good" address, IMHO, should be tagged as such only after a succesfully completed protocol handshake.
In addition to that any ping
sendout should be prohibited if the handshake has not been completed in full.
All of this should be helpful to solve issues #5168 and #6038 (protocol handshake should NOT be interleaved by any message other than version and verack).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also worth mentioning on the matter that latest bitcoin core implementation totally disregards the 26 bytes of the "from" address in version message while acknowledging the remote peer address from its addr messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As #6038 demonstrates, we cannot drop messages that are received before we have sent "verack", because previous versions of zcashd send them at that point and expect a response. We can stop sending them before we have received "verack", but we must accept them when received.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As #6038 demonstrates, we cannot drop messages that are received before we have sent "verack", because previous versions of zcashd send them at that point and expect a response. We can stop sending them before we have received "verack", but we must accept them when received.
I think abstain from consuming them before the handshake has been completed in full could be a valuable intermediate solution. Anyhow, just for my knowledge and education, latest bitcoin core's code base have moved the valuation of fSuccessfullyConnected
flag from CNode's into the processing of verack
message.
In any case I'm not pushing to have more changes in this PR specifically.
54bdbf5
to
09637cd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK 09637cd with nonblocking question.
>= MIN_PEER_PROTO_VERSION (170100) at this point. Signed-off-by: Daira Emma Hopwood <daira@jacaranda.org>
* fields were deserialized into variables of platform-dependent length; * pfrom->nVersion was set even if it does not meet the minimum version requirements; * make sure that pfrom->nTimeOffset is set before pfrom->fSuccessfullyConnected. At this point it is still possible for parsing to fail due to a `std::ios_base::failure`, which will not cause a disconnect. Co-authored-by: Andrea Lanfranchi <andrea.lanfranchi@gmail.com> Signed-off-by: Daira Emma Hopwood <daira@jacaranda.org>
Co-authored-by: Andrea Lanfranchi <andrea.lanfranchi@gmail.com> Signed-off-by: Daira Emma Hopwood <daira@jacaranda.org>
Co-authored-by: Andrea Lanfranchi <andrea.lanfranchi@gmail.com> Signed-off-by: Daira Emma Hopwood <daira@jacaranda.org>
This should make occurrences of NodeId wrapping essentially impossible for real-world usage. Backport of bitcoin/bitcoin#10176 [zcashd] I have checked that zcashd has no current uses of NodeId that depend on it being an `int`. All accesses to the global `nLastNodeId` are under lock. `NodeId` *is* required to be a signed integral type, because `-1` is used as a sentinel value. It is also formatted using the `%d` tinyformat specifier, but unlike the C format specifier it is inspired by, this correctly handles integral types of arbitrary width. There are `NodeId` fields in `CNodeStats`, `NodeEvictionCandidate`, and (test-only) `COrphanTx`, but those types are not serializable, and there is no other ad-hoc serialization of `NodeId` values apart from its use in the "id" field of the output from the `getpeerinfo` RPC. `UniValue` has an override of `pushKV` for `int64_t`, and so that use will correctly handle values up to the [maximum safe JSON/JavaScript integer](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER), i.e. $2^{53} - 1$. As upstream did, we argue that it is not feasible to cause that value to be exceeded. Co-authored-by: Cory Fields <cory-nospam-@coryfields.com> Signed-off-by: Daira-Emma Hopwood <daira@jacaranda.org>
09637cd
to
abe934c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK abe934c.
I re-confirmed that, per my above comment, this PR only rejects duplicate verack messages for outbound connections (because it reuses the existing state->fCurrentlyConnected
boolean which is used for tracking outbound connection attempts to addresses in addrman
). If duplicate verack rejection is desired for inbound connections, that will need to be implemented in a separate PR with a new boolean that isn't coupled to outbound connection attempts.
This is loosely based on and supercedes #6780 by @AndreaLanfranchi, but takes a different approach. It makes the following changes:
std::ios_base::failure
, the "version" message will be rejected and the peer will be marked to disconnect (pfrom->fDisconnect = true
).nVersion
andnServices
fields are no longer set (i.e. they will be left at 0) until after the mandatory fields of a "version" message have been successfully parsed and validated. Therefore, thenVersion
field will never be set to a non-zero value belowMIN_PEER_PROTO_VERSION
(orMIN_TESTNET_PEER_PROTO_VERSION
for Testnet). This makes the peer version handling simpler to reason about.pfrom->fSuccessfullyConnected
is not set until afterpfrom->nTimeOffset
has been initialized.