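A 3-node group cluster (each node 1c2g, i.e. 1 CPU core and 2 GB of memory) had been working normally. It looks like the network between the leader and a follower glitched briefly, after which the cluster entered an abnormal state and could not heal itself.
Logs observed on the leader (node 0-0): from this point on, the leader keeps rejecting preVote requests from follower 0-1.
Logs observed on the abnormal follower (node 0-1): 0-1 lost its connection to leader 0-0 briefly and then reconnected, but afterwards it got stuck in an endless preVote loop, even though 0-0 was still the leader. Watching follower 0-1 with arthas (command below) showed that AppendEntries requests were no longer arriving, and on the leader we confirmed that it was no longer sending AppendEntries to 0-1. As a result, 0-1 never rejoins the group properly.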
watch com.alipay.sofa.jraft.core.NodeImpl handleAppendEntriesRequest
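My guess is that the uncaught exception on the leader caused the heartbeat scheduling task to terminate and drop out, and that the next heartbeat is only scheduled from com.alipay.sofa.jraft.core.Replicator#onHeartbeatReturned for that follower, so it is never triggered again. @fengjiachun, do you have any suggestions? Any guidance would be appreciated.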
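To make the hypothesis above concrete, here is a minimal, generic sketch of a self-rescheduling timer chain. It is not JRaft code and does not claim to reproduce how `Replicator` works. The class `HeartbeatChainSketch`, the method `countBeats`, the cap of 6 beats, and the simulated failure on the third beat are all hypothetical names and values chosen for illustration. It only shows the general failure mode: if each beat schedules the next one, an exception that escapes before rescheduling silently ends the chain.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names come from JRaft.
class HeartbeatChainSketch {
    static final int MAX_BEATS = 6; // arbitrary cap so the demo terminates

    // Runs a chain in which every beat schedules the next one, and returns
    // how many beats ran before the chain stopped.
    static int countBeats(boolean failOnThirdBeat, boolean guarded) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger beats = new AtomicInteger();
        CountDownLatch stopped = new CountDownLatch(1);
        Runnable[] beat = new Runnable[1];
        beat[0] = () -> {
            boolean rescheduled = false;
            try {
                int n = beats.incrementAndGet();
                try {
                    // Stand-in for response handling that may throw.
                    if (failOnThirdBeat && n == 3) {
                        throw new IllegalStateException("simulated failure");
                    }
                } catch (RuntimeException e) {
                    if (!guarded) {
                        throw e; // escapes before the next beat is scheduled
                    }
                }
                if (n < MAX_BEATS) {
                    timer.schedule(beat[0], 5, TimeUnit.MILLISECONDS);
                    rescheduled = true;
                }
            } finally {
                if (!rescheduled) {
                    stopped.countDown(); // the chain has ended
                }
            }
        };
        timer.schedule(beat[0], 0, TimeUnit.MILLISECONDS);
        try {
            stopped.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        timer.shutdownNow();
        return beats.get();
    }

    public static void main(String[] args) {
        System.out.println(countBeats(false, false)); // healthy chain
        System.out.println(countBeats(true, false));  // exception escapes
        System.out.println(countBeats(true, true));   // exception contained
    }
}
```

In this sketch the unguarded failing run stops after the third beat and nothing is logged, because the executor stores the exception in the task's future instead of reporting it. Whether the leader in this issue hit the same kind of path is exactly what still needs to be confirmed from its logs and metrics.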
kill -s SIGUSR2 pid
https://www.sofastack.tech/projects/sofa-jraft/jraft-user-guide/
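See section 11 of the guide above: each node will produce three files, node_metrics, thread_pool_metrics, and node_describe.
Please post these three files from every node, preferably as text rather than screenshots.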
Environment
java -version: 8
uname -a: Linux 4.19.91-24.1.al7.x86_64