Take into account network latency when syncing #55

heifner · 2022-03-13T00:33:21Z

Take network latency into account when syncing from a node, to avoid getting stuck in a perpetual LIB catch-up state.

Port of: sync to chain in high latency net EOSIO/eos#11078
Does not enable the new test (p2p_high_latency_test.py), as it requires either iproute-tc or iproute2 to be installed, depending on the platform.

Co-authored-by: Farhad Shahabi <farhad.shahabi@block.one>

plugins/net_plugin/include/eosio/net_plugin/protocol.hpp

swatanabe · 2022-03-14T11:57:45Z

plugins/net_plugin/net_plugin.cpp

@@ -1642,15 +1644,25 @@ namespace eosio {

      sync_reset_lib_num(c);

+      auto current_time_ns = std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::system_clock::now().time_since_epoch()).count();
+      auto network_latency_ns = current_time_ns - msg.time; // net latency in nanoseconds


What if this is negative?

A negative value would mean clock skew between the nodes; I guess we should just make it 0 if it is < 0.

That makes sense if the clock skew is known to be small... but you removed the check for skew. If the clock skew is close to the latency, then one side will see double latency and the other will see 0 latency.

The check for skew that was removed never worked (see the comment I just added to the PR for that section of code). I'm open to suggestions on alternatives, but I don't think there is any way to improve that, right?

It would have to be based on RTT (which can be measured independent of clock skew) rather than one-way latency.

Yes, that would work, but it would require a new protocol version and a round-trip (RTT) message.

I think it's probably fine to assume low clock skew for now, since we've survived so long without a working check for clock skew. This PR doesn't make things worse in that regard.
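To make the skew argument in this thread concrete, here is a minimal sketch, not code from net_plugin. It models a link with a symmetric one-way delay and a peer clock offset (`link_model` and all function names are hypothetical), and shows that a one-way estimate absorbs the skew while a round-trip measurement taken entirely on the local clock does not.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model for illustration only; none of these names exist in net_plugin.
struct link_model {
   int64_t one_way_ns;   // true one-way network delay (assumed symmetric)
   int64_t peer_skew_ns; // peer clock minus local clock
};

// Peer sends at true time t and stamps the message with its own (skewed) clock.
// The local node subtracts that stamp from its own receive time, clamping at 0.
inline int64_t latency_seen_locally(const link_model& m, int64_t t) {
   int64_t stamp = t + m.peer_skew_ns;
   int64_t recv  = t + m.one_way_ns;
   return std::max<int64_t>(0, recv - stamp);
}

// Local node sends at true time t; the peer reads its skewed clock on receipt.
inline int64_t latency_seen_by_peer(const link_model& m, int64_t t) {
   int64_t stamp = t;
   int64_t recv  = t + m.one_way_ns + m.peer_skew_ns;
   return std::max<int64_t>(0, recv - stamp);
}

// Round trip: both timestamps come from the local clock, so the skew cancels.
inline int64_t round_trip_seen_locally(const link_model& m, int64_t t) {
   int64_t sent     = t;
   int64_t received = t + 2 * m.one_way_ns;
   return received - sent;
}
```

With a 50 ms one-way delay and a 50 ms skew, the local side computes 0 ns, the peer computes 100 ms (double the real latency), and the round trip is 100 ms regardless of the skew.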

heifner · 2022-03-14T16:31:19Z

plugins/net_plugin/net_plugin.cpp

-                  ("peer", msg.p2p_address)("time", "1 second")); // TODO Add to_variant for std::chrono::system_clock::duration
-         return false;
-      }
-


Adding a note to this PR for future documentation of why this was removed. I removed this code because it could never have worked: time is in microseconds whereas msg.time is in nanoseconds, so time - msg.time is always negative.

Also there is no way to do what this was trying to do. You don't know how much network latency is involved so you have no idea what clock skew is involved.
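The unit mismatch described above can be reproduced in isolation. The following is a sketch under assumptions, not the removed code: the helper names are hypothetical, and it only demonstrates that subtracting a nanosecond epoch count from a microsecond epoch count is negative for any present-day timestamps.

```cpp
#include <chrono>
#include <cstdint>

using sys_time = std::chrono::system_clock::time_point;

// Hypothetical helpers: the same instant as a raw count since the epoch.
inline int64_t as_micros(sys_time tp) {
   return std::chrono::duration_cast<std::chrono::microseconds>(tp.time_since_epoch()).count();
}
inline int64_t as_nanos(sys_time tp) {
   return std::chrono::duration_cast<std::chrono::nanoseconds>(tp.time_since_epoch()).count();
}

// Mixed units: local time in microseconds minus peer time in nanoseconds.
// The nanosecond count is about 1000x larger, so the result is negative.
inline int64_t mixed_unit_diff(sys_time local_now, sys_time peer_sent) {
   return as_micros(local_now) - as_nanos(peer_sent);
}

// Consistent units: both sides in nanoseconds.
inline int64_t same_unit_diff_ns(sys_time local_now, sys_time peer_sent) {
   return as_nanos(local_now) - as_nanos(peer_sent);
}
```

For a message stamped in March 2022 and received 80 ms later, `mixed_unit_diff` is a large negative number, while `same_unit_diff_ns` yields 80,000,000 ns.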

brianjohnson5972 · 2022-03-15T15:35:32Z

plugins/net_plugin/net_plugin.cpp

+      }
+      // number of blocks syncing node is behind from a peer node
+      uint32_t nblk_behind_by_net_latency = static_cast<uint32_t>(network_latency_ns / block_interval_ns);
+      // Multiplied by 2 to compensate the time it takes for message to reach peer node, and plus 1 to compensate for integer division truncation


I think changing it to "to reach back to that peer node" would make the 2 times clearer.
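For readers following the arithmetic in the quoted comment, here is a minimal standalone sketch of the calculation. The function name `blocks_behind_allowance` is hypothetical, and the 500 ms block interval is an assumption rather than something shown in this diff. The clamp to zero reflects the handling of negative latency discussed earlier in the review.

```cpp
#include <algorithm>
#include <cstdint>

// Assumption: a 500 ms block interval, expressed in nanoseconds.
constexpr int64_t block_interval_ns = 500000000;

// Hypothetical helper, not the net_plugin implementation.
// now_ns and msg_time_ns are nanoseconds since the epoch.
inline uint32_t blocks_behind_allowance(int64_t now_ns, int64_t msg_time_ns) {
   // A negative difference implies clock skew, so treat it as zero latency.
   int64_t network_latency_ns = std::max<int64_t>(0, now_ns - msg_time_ns);
   // Whole blocks produced during the one-way latency (integer division truncates).
   uint32_t nblk = static_cast<uint32_t>(network_latency_ns / block_interval_ns);
   // Times 2 for the reply to reach back to the peer, plus 1 for the truncation.
   return 2 * nblk + 1;
}
```

Under these assumptions, a latency of 1.2 s spans 2 whole blocks and gives an allowance of 5, while any latency under 500 ms, or a negative one, gives 1.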

Take into account network latency when syncing from a node to avoid g…

f57b9a8

…etting stuck in an always lib catchup state. Co-authored-by: Farhad Shahabi <farhad.shahabi@block.one>

heifner requested a review from brianjohnson5972 March 13, 2022 00:34

Fix overflow bug introduced in port of original eosio/eos PR

4c4689c

swatanabe reviewed Mar 14, 2022

heifner added 3 commits March 14, 2022 07:47

use int64_t instead of long long

7790347

Handle negative network latency (clock skew)

80bb777

Merge remote-tracking branch 'origin/main' into fsh-sync-to-chain

8c3f475

heifner requested a review from swatanabe March 14, 2022 12:50

heifner commented Mar 14, 2022

brianjohnson5972 suggested changes Mar 15, 2022

Clarify comment

3d6fc91

brianjohnson5972 approved these changes Mar 15, 2022

heifner merged commit 54d4286 into main Mar 16, 2022

heifner deleted the fsh-sync-to-chain branch March 16, 2022 12:47
