The relevant code that uses NuRaft is the part that adds servers:
    srv_config const srv_config_to_add(static_cast<int>(config.coordinator_id), endpoint);
    auto cmd_result = raft_server_->add_srv(srv_config_to_add);
    if (cmd_result->get_result_code() == nuraft::cmd_result_code::OK) {
      spdlog::info("Request to add server {} to the cluster accepted", endpoint);
    } else {
      throw RaftAddServerException("Failed to accept request to add server {} to the cluster with error code {}",
                                   endpoint, int(cmd_result->get_result_code()));
    }

    // Waiting for server to join
    constexpr int max_tries{10};
    auto maybe_stop = utils::ResettableCounter<max_tries>();
    constexpr int waiting_period{200};
    bool added{false};
    while (!maybe_stop()) {
      std::this_thread::sleep_for(std::chrono::milliseconds(waiting_period));
      const auto server_config = raft_server_->get_srv_config(static_cast<nuraft::int32>(config.coordinator_id));
      if (server_config) {
        spdlog::trace("Server with id {} added to cluster", config.coordinator_id);
        added = true;
        break;
      }
    }
    if (!added) {
      throw RaftAddServerException("Failed to add server {} to the cluster in {}ms", endpoint,
                                   max_tries * waiting_period);
    }
From what you provided, it seems there is a race between srv_to_join_.reset() and peer::lock_. However, that should not be possible: peer::handle_rpc_result holds myself, so the reference count of the shared pointer cannot drop to 0 while the handler is running. I haven't seen a TSAN alert during the normal server add/removal process.
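To illustrate the lifetime argument above, here is a minimal sketch in plain C++. It is not NuRaft code: Peer, Owner, handle_result and peer_survives_reset are hypothetical names invented for this example, and srv_to_join only loosely mirrors the srv_to_join_ member mentioned above. It shows that a handler holding its own shared pointer copy keeps the object alive even if the owner resets its pointer during the call.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a peer object; not NuRaft's actual class.
struct Peer {
    int id;
};

// Hypothetical owner that holds the peer while it is joining.
struct Owner {
    std::shared_ptr<Peer> srv_to_join;
};

// The handler takes its own shared_ptr copy ("myself"), so the Peer
// stays alive for the whole call even if the owner resets its pointer.
// Returns the reference count observed after the owner's reset.
long handle_result(std::shared_ptr<Peer> myself, Owner& owner) {
    assert(myself != nullptr);
    owner.srv_to_join.reset();  // owner drops its reference mid-call
    // myself is still valid here; the object has not been destroyed.
    return myself.use_count();
}

// Runs the scenario and reports whether the peer survived the reset.
bool peer_survives_reset() {
    Owner owner{std::make_shared<Peer>(Peer{1})};
    std::weak_ptr<Peer> watcher = owner.srv_to_join;
    long count_during_call = handle_result(owner.srv_to_join, owner);
    // During the call exactly one reference (myself) remained; once the
    // call has returned, that last reference is gone and the peer is freed.
    return count_during_call == 1 && watcher.expired();
}
```

Note that this sketch is single-threaded and covers object lifetime only. It says nothing about whether concurrent reads and writes of the same shared pointer variable are synchronized, which is the kind of access TSAN reports on.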
Can you please provide the exact commit hash of NuRaft that you used?
Also, do you get this TSAN alert every time you run the above code? If so, it would be great if you could share the logs generated by NuRaft so we can see what happened.
I tried the same code with the same NuRaft version, but still didn't get a TSAN alert.
Could you share your build environment, such as OS version and compiler version? Also, please share your Raft parameters (heartbeat, election lower/upper timeout, ...).
I found a data race using TSAN in my code. The issue seems to be in peer.