Cache the mapping of sequence to log block index in transaction log iterator #12538
+156
−15
Context/Summary:
The official documentation recommends using the GetUpdatesSince method for master-slave synchronization. Each time this method is called, it searches for the start sequence from the beginning of the target WAL file. Seeking to the start sequence can be expensive when the WAL file is large. When replication delay must be kept low, SeekToStartSequence() may be called very frequently and generate a large amount of read I/O (although in most cases the reads are served from the page cache).
Related issues:
#12516
#889
In most scenarios, successive calls to GetUpdatesSince use consecutive start sequences, so caching the previous seek position avoids most repeated reads of the WAL file.
This commit caches the mapping from sequence number to WAL file block index. The cache is updated both after seeking to the iterator's start sequence and after iterating to the end of the iterator. When seeking to the start sequence, the log reader looks up the closest cached batch sequence that is not larger than the target sequence and moves the WAL file read pointer to the start of that block, instead of scanning from the start of the file.
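The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `SeqToBlockCache` and its methods `Record` and `SeekOffset` are hypothetical names, and the block size is passed in by the caller (the WAL format uses 32 KB blocks, which is what the usage note assumes).

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical helper for illustration only; not part of RocksDB.
// Maps the first sequence number of a write batch to the index of the
// WAL block in which that batch starts.
class SeqToBlockCache {
 public:
  explicit SeqToBlockCache(uint64_t block_size) : block_size_(block_size) {
    assert(block_size > 0);
  }

  // Remember that the batch beginning at `seq` starts in block `block_index`.
  void Record(uint64_t seq, uint64_t block_index) {
    seq_to_block_[seq] = block_index;
  }

  // Finds the cached entry with the largest sequence that is not larger
  // than `target_seq`. On success, stores the byte offset of the start of
  // that block in `*offset` and returns true. Returns false when nothing
  // usable is cached, in which case the caller scans from the file start.
  bool SeekOffset(uint64_t target_seq, uint64_t* offset) const {
    auto it = seq_to_block_.upper_bound(target_seq);  // first entry > target
    if (it == seq_to_block_.begin()) {
      return false;  // every cached sequence is larger than the target
    }
    --it;  // step back to the last entry <= target
    *offset = it->second * block_size_;
    return true;
  }

 private:
  uint64_t block_size_;
  std::map<uint64_t, uint64_t> seq_to_block_;
};
```

With a cache built as `SeqToBlockCache cache(32768)` and entries recorded for sequence 100 in block 0 and sequence 250 in block 3, a seek to sequence 260 would start reading at the beginning of block 3 rather than at the beginning of the file, while a seek to sequence 99 would fall back to a full scan.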
The improvement for the GetUpdatesSince interface is significant.
Test:
covered by ./db_log_iter_test