Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NRG (2.11): Ignore vote requests if leader heard more recently than election timeout #5331

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

neilalexander
Copy link
Member

This is actually a safeguard that the Raft paper describes as a way of avoiding network partitioned nodes from coming back up with a high term number and causing the existing leader to step down unnecessarily. In the time that the isolated node was isolated, it is likely to have sat timing out constantly and increasing its term number with each new election attempt.

Section 6 "Cluster membership changes":

The third issue is that removed servers (those not in Cnew) can disrupt the cluster. These servers will not receive heartbeats, so they will time out and start new elections. They will then send RequestVote RPCs with new term numbers, and this will cause the current leader to revert to follower state. A new leader will eventually be elected, but the removed servers will time out again and the process will repeat, resulting in poor availability.

To prevent this problem, servers disregard RequestVote RPCs when they believe a current leader exists. Specifically, if a server receives a RequestVote RPC within the minimum election timeout of hearing from a current leader, it does not update its term or grant its vote. This does not affect normal elections, where each server waits at least a minimum election timeout before starting an election. However, it helps avoid disruptions from removed servers: if a leader is able to get heartbeats to its cluster, then it will not be deposed by larger term numbers.

TODO: the TestNRGSimpleElection test fails with this, need to investigate if this has changed step-down behaviour.

Signed-off-by: Neil Twigg neil@nats.io

@@ -256,7 +256,7 @@ func TestNRGSimpleElection(t *testing.T) {
require_NoError(t, rg.leader().node().StepDown())

// Wait for a vote request to come in.
msg := require_ChanRead(t, voteReqs, time.Second)
msg := require_ChanRead(t, voteReqs, time.Second*3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
msg := require_ChanRead(t, voteReqs, time.Second*3)
msg := require_ChanRead(t, voteReqs, 3*time.Second)

@mprimi
Copy link
Contributor

mprimi commented Apr 22, 2024

I'd suggest the following logic (very close to what you did, but not exactly the same)

New state variable: leaderLastAETimestamp

Every time you get an AE from the node you think is leader:
leaderLastAETimestamp = time.Now()

When you get a vote request (regardless of the state you are in):
if [time.Since(leaderLastAETimestamp) < some threshold] then ignore request

(the conditional above may need a special case for voluntary leader transitions)

@neilalexander neilalexander force-pushed the neil/nrgvoterequestbeforeelectiontimeout branch from 2f60796 to 43869fc Compare April 24, 2024 09:25
@neilalexander neilalexander changed the title NRG: Ignore vote requests if leader heard more recently than election timeout NRG (2.11): Ignore vote requests if leader heard more recently than election timeout Apr 24, 2024
… timeout

This is actually a safeguard that the Raft paper prescribes as a way of avoiding
network partitioned nodes from coming back up with a high term number and causing
the existing leader to step down unnecessarily.

Signed-off-by: Neil Twigg <neil@nats.io>
@neilalexander neilalexander force-pushed the neil/nrgvoterequestbeforeelectiontimeout branch from 43869fc to 786711e Compare April 24, 2024 13:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants