RFC: Message level mechanism for disabling proposal forwarding #73

Open
mitake opened this issue Jun 5, 2023 · 0 comments

The Raft library provides the parameter DisableProposalForwarding. When the flag is true, a follower doesn’t forward MsgProp messages to the leader. I think a similar option that applies to individual Messages might be valuable, and I’d appreciate comments on this idea.

Context: a program which uses the Raft library can have asynchronous processes inside a server process that issue Raft requests. In the case of etcd, lease and compaction are typical examples. I guess other users of the library might have similar processes.

Such an asynchronous process can be implemented as a goroutine whose behavior depends on whether its node is the leader. If the node is the leader, the goroutine issues Raft messages asynchronously (and periodically).

This behavior might be problematic in some cases. If the goroutine is paused for some reason (high CPU load, slow disk I/O, etc.), the node can become a follower through a new leader election in the meantime. The problem is that the goroutine can then act on stale information that the node is still the leader and keep issuing Raft requests. If those requests must not be duplicated, this can be harmful; otherwise it is presumably benign. In the case of etcd, it can cause lease revocation by a stale leader: etcd-io/etcd#15247
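The check-then-act gap described above can be sketched as follows. This is a minimal, single-goroutine model, not etcd or Raft library code: `node`, `propose`, `revokeExpiredLease`, and the `pause` hook are all hypothetical names, and the plain `bool` stands in for state that a real server would need to synchronize.

```go
package main

// node is a hypothetical stand-in for a server embedding the Raft library.
type node struct {
	isLeader  bool     // what the background task believes; updated on role changes
	proposals []string // requests handed to Raft
}

// propose models handing a request to the Raft library.
func (n *node) propose(req string) {
	n.proposals = append(n.proposals, req)
}

// revokeExpiredLease models one iteration of a leader-only background task.
// pause stands in for an arbitrary stall (CPU contention, slow disk, ...).
func (n *node) revokeExpiredLease(pause func()) bool {
	if !n.isLeader { // check
		return false
	}
	pause()                  // leadership may change while the task is stalled
	n.propose("LeaseRevoke") // act on the possibly stale check
	return true
}

// staleProposal reports whether the task issued a request after the node
// had already lost leadership.
func staleProposal() bool {
	n := &node{isLeader: true}
	issued := n.revokeExpiredLease(func() { n.isLeader = false })
	return issued && !n.isLeader
}
```

In this model `staleProposal()` returns true: the request is issued even though the node stopped being the leader between the check and the proposal.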

Note that this problem should happen only if the Raft library logic recognizes itself as a follower:

  • If the Raft library logic still recognizes itself as the leader: MsgProp messages will be handled by the stale node itself and result in MsgApp. In this case other nodes can reject the MsgApp messages because they carry an old term.
  • If the Raft library logic already recognizes itself as a follower: MsgProp will be forwarded to the new leader, and the new leader will send MsgApp. Although the original source of the messages is the stale leader, the cluster under the new leader can accept them.

(The above behavior is quite subtle and I'm still checking it; I'd be glad to get feedback on it too.)
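To make the two cases easier to discuss, here is a small decision-table model of them. It only encodes the description in this issue, which is itself still being checked, and it is not the library's code: `localView`, `propose`, and the `outcome` values are hypothetical names. The `disableProposalForwarding` field mirrors the existing DisableProposalForwarding parameter.

```go
package main

type role int

const (
	follower role = iota
	leader
)

type outcome string

const (
	appended        outcome = "appended by the local leader"
	rejectedOldTerm outcome = "MsgApp rejected by peers: old term"
	forwarded       outcome = "forwarded to the new leader and appended"
	dropped         outcome = "dropped on the follower"
)

// localView is what the Raft state machine on this node currently believes.
type localView struct {
	role                      role
	term                      uint64
	disableProposalForwarding bool
}

// propose returns what happens to a MsgProp stepped on this node while the
// rest of the cluster is at clusterTerm.
func propose(v localView, clusterTerm uint64) outcome {
	switch {
	case v.role == leader && v.term < clusterTerm:
		// Stale leader: it emits MsgApp, but peers see the old term.
		return rejectedOldTerm
	case v.role == leader:
		return appended
	case v.disableProposalForwarding:
		return dropped
	default:
		// Follower with forwarding enabled: the new leader appends the entry.
		return forwarded
	}
}
```

The problematic path is the `forwarded` outcome: the proposal originates from a task that believes it runs on the leader, yet it is accepted under the new term.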

I guess setting DisableProposalForwarding to true would be a simple way to avoid this situation. However, if a program which uses the Raft library doesn’t provide a client-side mechanism for finding the leader and sending messages to it (e.g. etcd clientv3), the parameter will make the program non-functional because it affects all messages. So I think it would be nice if the Raft library could provide a mechanism to disable proposal forwarding only for specific message types.

There might be some possible approaches:

  • Adding a new flag to Message: it’s simple but might be too large a change.
  • Adding a callback mechanism that decides whether a given message should skip proposal forwarding when the node is a follower. In the case of etcd, etcd can supply a callback which drops lease or compaction related Raft requests.
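A rough sketch of the callback approach, to make the proposal concrete. Everything here is a self-contained model and an assumption: the `ShouldForward` field does not exist in the Raft library, the `config`, `message`, `entry`, and `followerNode` types only imitate the library's shapes, and the `lease:` / `compact:` payload prefixes are invented for illustration (etcd would inspect its own request encoding instead).

```go
package main

import (
	"bytes"
	"errors"
)

// errProposalDropped models the error returned for a dropped proposal.
var errProposalDropped = errors.New("proposal dropped")

type entry struct{ Data []byte }

type message struct{ Entries []entry }

type config struct {
	// DisableProposalForwarding mirrors the existing all-or-nothing parameter.
	DisableProposalForwarding bool
	// ShouldForward is the hypothetical per-message callback. A nil value
	// means every proposal is forwarded.
	ShouldForward func(m message) bool
}

// followerNode models only the follower-side handling of MsgProp.
type followerNode struct {
	cfg       config
	forwarded []message // proposals sent on to the leader
}

// stepProp decides whether a proposal received on a follower is forwarded.
func (f *followerNode) stepProp(m message) error {
	if f.cfg.DisableProposalForwarding {
		return errProposalDropped
	}
	if f.cfg.ShouldForward != nil && !f.cfg.ShouldForward(m) {
		return errProposalDropped
	}
	f.forwarded = append(f.forwarded, m)
	return nil
}

// forwardUnlessLeaderOnly is an example application-side callback that
// refuses to forward requests only the leader should issue.
func forwardUnlessLeaderOnly(m message) bool {
	for _, e := range m.Entries {
		if bytes.HasPrefix(e.Data, []byte("lease:")) ||
			bytes.HasPrefix(e.Data, []byte("compact:")) {
			return false
		}
	}
	return true
}
```

With `ShouldForward` set to `forwardUnlessLeaderOnly`, ordinary client writes received on a follower are still forwarded, while lease and compaction requests from a stale background task are dropped on the follower instead of reaching the new leader.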

I’d like to know other people’s opinions and how other programs which use the Raft library deal with this kind of issue.

cc @ahrtr @serathius @tbg @pavelkalinnikov
