Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: send up-to-date commit index in heartbeats #140

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

pav-kv
Copy link
Contributor

@pav-kv pav-kv commented Jan 26, 2024

Fixes #138

This change makes the commit index advancement in handleHeartbeat safe.
Previously, a follower would attempt to update the commit index to
whichever was sent in the MsgHeartbeat message. Out-of-bound indices
would crash the node.

It is always safe to advance a commit index if the follower's log is "in
sync" with the leader, i.e. when its log is guaranteed to be a prefix of
the leader's log. This becomes true when the first MsgApp append message
succeeds.

At the moment, the leader will never send a commit index that exceeds
the follower's log size. However, this may change in future. This change
is a defence-in-depth.

Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
This commit adds a test demonstrating the effect of delayed commit on a
follower node after a network hiccup between the leader and this
follower.

In the described scenario, after the moment of committing an entry on
the leader, it takes HeartbeatInterval + 3/2 * RTT until the follower
learns this entry is committed.

This is suboptimal, and could take HeartbeatInverval + 1/2 * RTT if the
leader didn't cut the commit index at Progress.Match before sending it
to the follower.

Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
TODO: describe why it is now safe

Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Lagging follower commit after message drops
2 participants