Commit
[#20234] DocDB: Improved metacache to refresh using TabletConsensusInfo when received NOT_THE_LEADER error

Summary:
Problem Background:
In our system, when a client needs to perform an operation on a specific tablet, it first needs to find out which server is currently responsible for that operation. If the operation is a WriteRpc, for example, it must find the tablet leader server. However, the system's current method of figuring out the tablet leader is not very efficient. It tries to guess the leader based on a list of potential servers (peers), but this guessing game can be slow, especially when there are many servers or when the servers are located far apart geographically. This inefficiency can lead to operations failing because the leader wasn't found quickly enough.

Additionally, the system doesn't handle server failures well. If a server is down, it might take a long time for the system to stop trying to connect to it, wasting valuable seconds on each attempt. While there's a mechanism to avoid retrying a failed server for 60 seconds, it's not very effective when a server is permanently out of service. One reason for this inefficiency is that the system's information about who the leaders are (stored in the meta cache) can become outdated, and it doesn't get updated as long as the system can still perform its tasks with the outdated information, even if doing so results in repeated connection failures.

Solution Introduction:
This ticket introduces a preliminary change aimed at improving how the system tracks the current leader for each tablet. The idea is to add a new piece of information to the meta cache called "raft_config_opid", which records the latest confirmed leadership configuration for each tablet. This way, when the system receives new information about the leadership configuration (which can happen during normal operations from other servers), it can check this new information against what it already knows.
If the new information is more up to date, the system can update its meta cache, potentially avoiding wasted effort on trying to connect to servers that are no longer leaders or are down.

This diff, combined with D33197 and D33598, updates the meta-cache using the TabletConsensusInfo that is piggybacked on a Write/Read/GetChanges/GetTransactionStatus ResponsePB when a request that requires the leader was sent to a non-leader. These frequent RPC requests should keep our meta-cache sufficiently up to date to avoid the situation that caused the CE.

Upgrade/Rollback safety:
The field added to the ResponsePBs is not persisted on disk; it is guarded by protobuf's backward compatibility.

Jira: DB-9194

Test Plan:
Unit Testing:
1. ClientTest.TestMetacacheRefreshWhenSentToWrongLeader: Changes the leadership of a RaftGroup after the meta-cache is already filled in. This introduces a discrepancy between the information in the meta-cache and the actual cluster configuration, so the caller should get a NOT_THE_LEADER error. Normally, this prompts the TabletInvoker to try the next-in-line replica's Tablet Server, and with our test setup this guarantees that the TabletInvoker will try the RPC at least 3 times. However, because this diff introduces the mechanism to refresh the meta-cache right away after a NOT_THE_LEADER error, we should observe that the RPC succeeds in 2 tries rather than 3 or more: the first attempt receives the piggybacked TabletConsensusInfo and updates the meta-cache, while the second attempt uses the refreshed meta-cache and finds the correct leader to send the request to.
2. CDCServiceTestMultipleServersOneTablet.TestGetChangesRpcTabletConsensusInfo: Since the GetChanges code path for updating the meta-cache diverges sufficiently from the other RPC types, this test is introduced to explicitly check that when a cdc proxy receives a NOT_THE_LEADER error, its meta-cache is refreshed.
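
Illustrative sketch (not code from this diff): the refresh rule described under "Solution Introduction" can be pictured roughly as below. Every name here other than TabletConsensusInfo and raft_config_opid is hypothetical, the struct layouts are simplified stand-ins for the real protobuf and meta-cache types, and raft_config_opid is assumed to be a single monotonically increasing integer, which may not match the actual representation.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified view of the consensus info piggybacked on a
// response that failed with NOT_THE_LEADER.
struct TabletConsensusInfo {
  std::string tablet_id;
  int64_t raft_config_opid;  // Assumption: larger value means newer config.
  std::string leader_uuid;
  std::vector<std::string> peer_uuids;
};

// Hypothetical meta-cache entry for one tablet.
struct CachedTablet {
  int64_t raft_config_opid;
  std::string leader_uuid;
  std::vector<std::string> peer_uuids;
};

// Toy meta cache keyed by tablet id. Not thread safe.
class MetaCacheSketch {
 public:
  void Put(const std::string& tablet_id, CachedTablet entry) {
    tablets_[tablet_id] = std::move(entry);
  }

  // Applies the piggybacked info only if it is strictly newer than the
  // cached configuration. Returns true if the entry was refreshed.
  bool MaybeRefresh(const TabletConsensusInfo& info) {
    auto it = tablets_.find(info.tablet_id);
    if (it == tablets_.end()) {
      return false;  // Unknown tablet: this sketch simply ignores it.
    }
    if (info.raft_config_opid <= it->second.raft_config_opid) {
      return false;  // Stale or identical config: keep the cached entry.
    }
    it->second = CachedTablet{info.raft_config_opid, info.leader_uuid,
                              info.peer_uuids};
    return true;
  }

  // Returns the cached entry, or nullptr if the tablet is not cached.
  const CachedTablet* Find(const std::string& tablet_id) const {
    auto it = tablets_.find(tablet_id);
    return it == tablets_.end() ? nullptr : &it->second;
  }

 private:
  std::unordered_map<std::string, CachedTablet> tablets_;
};
```

In this sketch, a retry that runs after MaybeRefresh returns true would read the new leader_uuid from Find and go straight to that server instead of walking the peer list, which is the behavior the first unit test checks for.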

Reviewers: mlillibridge, xCluster, hsunder
Reviewed By: mlillibridge
Subscribers: yql, jason, ycdcxcluster, hsunder, ybase, bogdan
Differential Revision: https://phorge.dev.yugabyte.com/D33533
1 parent ee9321d, commit 942de8f
Showing 38 changed files with 901 additions and 117 deletions.