Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test/BIP324: disconnection scenarios during v2 handshake #29431

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

stratospher
Copy link
Contributor

Add tests for the following v2 handshake scenarios:

  1. Disconnection happens when > MAX_GARBAGE_LEN bytes garbage is sent
  2. Disconnection happens when incorrect garbage terminator is sent
  3. Disconnection happens when garbage bytes are tampered with
  4. Disconnection happens when AAD of first encrypted packet after the garbage terminator is not filled
  5. bitcoind ignores non-empty version packet and no disconnection happens

All these tests require a modified v2 P2P class (different from EncryptedP2PState used in v2_p2p.py) to implement our custom handshake behaviour based on different scenarios and have been kept in a single test file (test/functional/p2p_v2_misbehaving.py). Shifted the test in test/functional/p2p_v2_earlykeyresponse.py which is of the same pattern to this file too.

@DrahtBot
Copy link
Contributor

DrahtBot commented Feb 14, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK sr-gi
Concept ACK theStack, kevkevinpal, mzumsande

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)
  • #29500 (test: create assert_not_equal util by kevkevinpal)
  • #29420 (test: extend the SOCKS5 Python proxy to actually connect to a destination by vasild)
  • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
  • #28521 (net: additional disconnect logging by Sjors)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@stratospher
Copy link
Contributor Author

@sr-gi, i tried this manually sending idea but i still think intermittent failures are possible there.

  • we can't get rid of can_data_be_received variable because if we don't use this variable, test would succeed irrespective of whether we send 4 bytes network magic first or 4 bytes from ellswift bytes first and we don't want that.
  • so since data_received() always happens in Network thread and send of ellswift bytes + setting can_data_be_received=True happens on MainThread, in the rare scenario that data_received() gets called before setting can_data_be_received, an intermittent failure could happen i think.

here's a branch where i tweaked the code you shared a bit with a sleep statement for making the test crash and reintroducing can_data_be_received variable.

@theStack
Copy link
Contributor

Concept ACK

Copy link
Contributor

@kevkevinpal kevkevinpal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

added two comments I will continue to test these changes as I am still trying to understand fully what the tests do exactly

test/functional/p2p_v2_misbehaving.py Show resolved Hide resolved
test/functional/p2p_v2_misbehaving.py Show resolved Hide resolved
@sr-gi
Copy link
Member

sr-gi commented Feb 29, 2024

  • we can't get rid of can_data_be_received variable because if we don't use this variable, test would succeed irrespective of whether we send 4 bytes network magic first or 4 bytes from ellswift bytes first and we don't want that.

I've been playing around sending anything different than the first 4 bytes as the network magic and test fail on my end (which is expected given data_received is called with magic_sent false on the first-byte mismatch).

Do you have any working example I can try to reproduce?

@stratospher
Copy link
Contributor Author

I've been playing around sending anything different than the first 4 bytes as the network magic and test fail on my end (which is expected given data_received is called with magic_sent false on the first-byte mismatch).

yes! that's expected behaviour and sending anything other than the first 4 bytes as network magic will make the test fail on master.

what i was trying to say was:

  1. on master
  • replacing return b"\xfa\xbf\xb5\xda" with any other 4 bytes will make the test fail and this is expected behaviour
  1. using this patch mentioned in test: fix intermittent failure in p2p_v2_earlykeyresponse #29352
  • replacing return b"\xfa\xbf\xb5\xda" with any other 4 bytes will make the test pass and this isn't behaviour we want
  • this happens because can_data_be_received variable is removed in the patch

so i was just trying to say that we can't remove can_data_be_received variable.

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

The one thing I'm a bit unsure about is that this duplicates quite a lot of code, especially in complete_handshake - the main implementation in v2_p2p.py and the overwritten version from this test could run out of sync in the future. But I don't have a good idea how to avoid this yet...

test/functional/test_framework/test_node.py Outdated Show resolved Hide resolved
test/functional/p2p_v2_misbehaving.py Outdated Show resolved Hide resolved
test/functional/p2p_v2_misbehaving.py Outdated Show resolved Hide resolved
stratospher and others added 2 commits March 11, 2024 12:58
Early key response test is a special kind of test which requires
modified v2 handshake functions. More such tests can be added
where v2 handshake functions send incorrect garbage terminator,
excess garbage bytes etc.. Hence, rename p2p_v2_earlykey.py to a
general test file name - p2p_v2_misbehaving.py.

random_bitflip function (used in signature tests prior to this
commit) can be used in p2p_v2_misbehaving test to generate wrong
garbage terminator, wrong garbage bytes etc..
So, move the function to util.
Adds a new boolean parameter `expect_success` to the
`add_p2p_connection` method. If set, the node under
test doesn't wait for connection to be established
and is useful for testing scenarios when disconnection
is expected.

Without this parameter, intermittent test failures can happen
if the disconnection happens before wait_until for is_connected
is hit inside `add_p2p_connection`.

Co-Authored-By: Sebastian Falbesoner <sebastian.falbesoner@gmail.com>
@stratospher
Copy link
Contributor Author

The one thing I'm a bit unsure about is that this duplicates quite a lot of code, especially in complete_handshake - the main implementation in v2_p2p.py and the overwritten version from this test could run out of sync in the future. But I don't have a good idea how to avoid this yet...

hmm good point. will need to think about it more. simplest solution would be keeping it all in main implementation in v2_p2p.py haha but doesn't make sense to have it there just for this test file :(

@theStack
Copy link
Contributor

The one thing I'm a bit unsure about is that this duplicates quite a lot of code, especially in complete_handshake - the main implementation in v2_p2p.py and the overwritten version from this test could run out of sync in the future. But I don't have a good idea how to avoid this yet...

hmm good point. will need to think about it more. simplest solution would be keeping it all in main implementation in v2_p2p.py haha but doesn't make sense to have it there just for this test file :(

One idea to avoid at least reimplementing generate_keypair_and_garbage would be to allow setting a fixed garbage length with a new parameter (that is None by default, meaning that a random length will be picked), e.g:

diff --git a/test/functional/test_framework/v2_p2p.py b/test/functional/test_framework/v2_p2p.py
index 8f79623bd8..c27cd5e2fe 100644
--- a/test/functional/test_framework/v2_p2p.py
+++ b/test/functional/test_framework/v2_p2p.py
@@ -111,10 +111,11 @@ class EncryptedP2PState:
             # Responding, place their public key encoding first.
             return TaggedHash("bip324_ellswift_xonly_ecdh", ellswift_theirs + ellswift_ours + ecdh_point_x32)
 
-    def generate_keypair_and_garbage(self):
+    def generate_keypair_and_garbage(self, garbage_len=None):
         """Generates ellswift keypair and 4095 bytes garbage at max"""
         self.privkey_ours, self.ellswift_ours = ellswift_create()
-        garbage_len = random.randrange(MAX_GARBAGE_LEN + 1)
+        if garbage_len is None:
+            garbage_len = random.randrange(MAX_GARBAGE_LEN + 1)
         self.sent_garbage = random.randbytes(garbage_len)
         logger.debug(f"sending {garbage_len} bytes of garbage data")
         return self.ellswift_ours + self.sent_garbage

The overruled method of initiate_v2_handshake in the test could then call this with the parameter set if needed, e.g. for the EXCESS_GARBAGE case:

     def initiate_v2_handshake(self):
         if self.test_type == TestType.EARLY_KEY_RESPONSE:
             .....
         elif self.test_type == TestType.EXCESS_GARBAGE:
             # send > 4095 bytes garbage
             return self.generate_keypair_and_garbage(garbage_len=MAX_GARBAGE_LEN + random.randrange(1, 10))

(For the WRONG_GARBAGE test, bit-flipping self.sent_garbage right after calling generate_keypair_and_garbage should still be sufficient for causing a mismatch.)

@sr-gi
Copy link
Member

sr-gi commented May 6, 2024

I find the approach fairly hard to follow here. Having all the logic in the constructor with if/elses based on the connection type instead of having different constructors/a test for each type of failure feels really error-prone and difficult to make sure if a test is doing what we expect. I think it'd be better to have multiple classes that build from v2_p2p and modify what's needed, even if the file gets larger.

Just an example:

In the WRONG_GARBAGE_TERMINATOR case, we are modifying the gargabe in two ways. First, we are making it smaller than expected, but also, we are randomly flipping some of its bits. Turns out if you removed both changes, the tests still passes, but I'm not sure why. I would expect this not to be the case, since removing the relevant parts should make the class equal to the expected v2_p2p.

@stratospher
Copy link
Contributor Author

I think it'd be better to have multiple classes that build from v2_p2p and modify what's needed, even if the file gets larger.

that's a great suggestion @sr-gi! it's possible to design those classes in such a way that it avoids code duplication and the file is almost the same size!

Turns out if you removed both changes, the tests still passes, but I'm not sure why.

because we don't send a version message to ensure that the disconnection happens in the v2 handshake phase. in your case (different from what we are testing here), we send the correct garbage terminator but because we don't send a version message - disconnection is still expected.

Currently, we log the number of bytes of garbage when it is
generated. The log is a better fit for when the garbage
actually gets sent to the transport layer.
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/25085973641

@stratospher
Copy link
Contributor Author

I've updated the PR to have different child class implementations for EncryptedP2PState based on the disconnection scenario we're testing - EarlyKeyResponseState, ExcessGarbageState etc...

Also introduced 2 more commits for cleaner code:

Copy link
Member

@sr-gi sr-gi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this approach much better, it is way easier to follow and reason about. I left some comments inline.

because we don't send a version message to ensure that the disconnection happens in the v2 handshake phase. in your case (different from what we are testing here), we send the correct garbage terminator but because we don't send a version message - disconnection is still expected.

What is weird is that the test is expecting a specific log message, which seems to also match in this case. This is still the case by the way. Is there an alternative way of making sure this does not happen?

test/functional/test_framework/v2_p2p.py Outdated Show resolved Hide resolved
Comment on lines 31 to 45
"""Send ellswift and garbage bytes in 2 parts when TestType = (EARLY_KEY_RESPONSE)"""
# Here, the 64 bytes ellswift is assumed to have it's first 4 bytes match network magic bytes.
# It is sent in 2 phases:
# 1. when `magic_sent` = False, send first 4 bytes of ellswift (matches network magic bytes)
# 2. when `magic_sent` = True, send remaining 60 bytes of ellswift
if not self.magic_sent:
self.generate_keypair_and_garbage()
self.magic_sent = True
return MAGIC_BYTES[self.net]
else:
# `can_data_be_received` is a variable used to assert if data is received on recvbuf.
# 1. v2 TestNode shouldn't respond back if we send V1_PREFIX and data shouldn't be received on recvbuf.
# This state is represented using `can_data_be_received` = False.
# 2. v2 TestNode responds back when mismatch from V1_PREFIX happens and data can be received on recvbuf.
# This state is represented using `can_data_be_received` = True.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 6d9df28

I still think this can be simplified. initiate_v2_handshake is going to be called, asynchronously, when creating this object. You can reduce a lot of the boilerplate, and get rid of magic_sent, as long as you do the last bit manually on the test (or just create another method, like continue_v2_handshake).

It just feels weird to me that this initialization method is called twice (one async, once manually), it doesn't feel too intuitive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this can be simplified. initiate_v2_handshake is going to be called, asynchronously, when creating this object. You can reduce a lot of the boilerplate, and get rid of magic_sent, as long as you do the last bit manually on the test (or just create another method, like continue_v2_handshake).

yes, that can be done since the last bit is being sent from MainThread anyways.

before:

  1. send both 4 bytes network magic (first 4 bytes of ellswift) - NetworkThread
  2. remaining ellswift_garbage bytes - MainThread

now:

  1. send both 4 bytes network magic (first 4 bytes of ellswift) - MainThread
  2. remaining ellswift_garbage bytes - MainThread

realised my earlier comment is the situation in master too and a sleep in MainThread where there's no waiting for some other tasks is unlikely. 😬

I've used your original suggestion and updated the PR to send everything manually!

Currently, transport version is a global variable declared as
TRANSPORT_VERSION in v2_p2p.py. Making it an instance variable
would help in sending non empty transport version packets for
testing purposes. It might also help EncryptedP2PState be more
extensible in far future protocol upgrades.
Prior to this commit, TestEncryptedP2PState would always
send initial_v2_handshake bytes in 2 parts (as required
by early key response test).

For generalising this test and having different v2 handshake
behaviour based on the test type, special behaviours like
sending initial_v2_handshake bytes in 2 parts are executed
only if test_type is set to EARLY_KEY_RESPONSE.
This test type is represented using EXCESS_GARBAGE.
…is sent

This test type is represented using WRONG_GARBAGE_TERMINATOR.
since the wrong garbage terminator is sent to TestNode, TestNode
will interpret all of the gabage bytes, wrong garbage terminator,
decoy messages and version packet it receives as garbage bytes.

If the length of all these is more than 4095 + 16, it will result
in a missing garbage terminator error. otherwise, it will result
in a V2 handshake timeout error.

Send only MAX_GARBAGE_LEN//2 bytes of garbage data to TestNode
so that the total length received by the TestNode is at max
= (MAX_GARBAGE_LEN//2) + 16 + 10*120 + 20 = 3283 bytes
(which is less than 4095 + 16 bytes) and we get a consistent
V2 handshake timeout error message.

If we do not limit the garbage length sent, we will intermittently
get both missing garbage terminator error and V2 handshake
timeout error based on the garbage length and decoy packets length
which are chosen at random.
… different

This test type is represented using WRONG_GARBAGE.
Here, garbage bytes sent to TestNode are assumed to be tampered with and
do not correspond to the garbage bytes which P2PInterface calculated and
uses.
This test type is represented using SEND_NO_AAD. If AAD of the first encrypted packet
sent after the garbage terminator (optional decoy packet/version packet) hasn't been
filled, disconnection happens.
…tion happens

This test type is represented using SEND_NON_EMPTY_VERSION_PACKET.
@stratospher
Copy link
Contributor Author

thank you for the helpful reviews @sr-gi!

I've updated the PR:

  • EARLY_KEY_RESPONSE test doesn't have magic_sent now and the initial_v2_handshake() function is cleaner
  • EARLY_KEY_RESPONSE test does sending of 4 bytes ellswift (which match network magic) + remaining ellswift and garbage bytes on MainThread - both are done manually now
  • Added v2 handshake timeout log in 7d07daa

What is weird is that the test is expecting a specific log message, which seems to also match in this case. This is still the case by the way. Is there an alternative way of making sure this does not happen?

hmm fair point, i've added a log message in InactivityCheck() to distinguish v2 handshake timeout case from version handshake timeout.

Copy link
Member

@sr-gi sr-gi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love it now 🚀

ACK ae528cb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants