Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement obsolete file deletion (GC) in follower #12657

Closed
wants to merge 3 commits into from

Conversation

anand1976
Copy link
Contributor

@anand1976 anand1976 commented May 14, 2024

This PR implements deletion of obsolete files in a follower RocksDB instance. The follower tails the leader's MANIFEST and creates links to newly added SST files. These links need to be deleted once those files become obsolete in order to reclaim space. There are three cases to be considered -

  1. New files added and links created, but the Version could not be installed due to some missing files. Those links need to be preserved so a subsequent catch up attempt can succeed. We insert the next file number in the VersionSet to pending_outputs_ to prevent their deletion.
  2. Files deleted from the previous successfully installed Version. These are deleted as usual in PurgeObsoleteFiles.
  3. New files added by a VersionEdit and deleted by a subsequent VersionEdit, both processed in the same catchup attempt. Links will be created for the new files when verifying a candidate Version. Those need to be deleted explicitly as they're never added to VersionStorageInfo, and thus not deleted by PurgeObsoleteFiles.

Test plan -
New unit tests in db_follower_test.

Copy link
Contributor

@jowlyzhang jowlyzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the change. It looks great, just some nit comments.

db/version_set.h Show resolved Hide resolved
db/version_edit_handler.cc Outdated Show resolved Hide resolved
db/version_edit_handler.cc Outdated Show resolved Hide resolved
db/version_set.cc Outdated Show resolved Hide resolved
Copy link
Contributor Author

@anand1976 anand1976 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick review!

db/version_edit_handler.cc Outdated Show resolved Hide resolved
db/version_edit_handler.cc Outdated Show resolved Hide resolved
db/version_set.cc Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@anand1976 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

JobContext purge_files_job_context(0);
{
InstrumentedMutexLock lock_guard(&mutex_);
// Currently, secondary instance does not own the database files, thus it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it seems like this comment needs update.

}
follower_filenums.insert(test::GetFileNumber(name));
}
db_filenums.merge(follower_filenums);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this is checking that the files on the follower is a superset of the files on the leader. Should the relationship be reversed? Since some intermediate files that may still exist on leader are not gonna be linked (or strictly speaking, linked and immediately removed) when follower jump forward and just installs the newest Version. In other cases where either the follower fails to install a Version or installs as many Versions as the leader do. Since follower can only link a file that's already present on the leader, so leader's files should be also a superset of the follower's files too. Is that right?

In order to check follower correctly cleans up obsolete files, maybe checking this relationship holds is not sufficient. Would it make sense for this verification method to take the current storage info of the follower and verify the directory contains no more and no less file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is checking that the leader and follower files are identical. std::set::merge will move any element in follower_filenums that's not in db_filenums. Let's say either or both directories have some unique files, after the merge db_filenums will be a superset of follower_filenums. If they're equal in size, then the files in both are identical, since otherwise merge would have moved the different to db_filenums.

When CheckDirs() is called, both leader and follower are expected to be in their respective final states, so there shouldn't be any intermediate or obsolete files on the leader. Let's say the follower doesn't delete an intermediate file due to a bug, then it wouldn't be present in the leader directory and this check would fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Thanks for the detailed explanation. I didn't notice how merge is also updating follower_filenums. Checking them to be exactly identical makes sense.

// This test creates 4 L0 files, immediately followed by a compaction to L1.
// The follower replays the 4 flush records from the MANIFEST unsuccessfully,
// and then successfully recovers a Version from the compaction record
TEST_F(DBFollowerTest, RetryCatchup) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test and a few other tests currently is not CheckDirs, should it be added to all of them?

ASSERT_OK(Put("k1", "v4"));
ASSERT_OK(Flush());

// FlushOptions fo;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:Are these commented out test lines outdated?

Reopen(opts);
ASSERT_OK(OpenAsFollower());

follower_fs()->BarrierInit(2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't quite get what these Barrier* functions helped to achieve, would you mind adding a bit clarification?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I assume you're asking about this particular test rather than the Barrier* functions in general. So in this test, they ensure that the follower sees the next 4 L0 files and is able to link them, and then sees the compaction that obsoletes those L0 files (so those L0 files are intermediates that it has to explicitly delete). Suppose we don't have any barriers, its possible the follower reads the L0 records and compaction records from the MANIFEST in one read, which means those L0 files would have already been deleted by the leader and the follower cannot link to them.

Copy link
Contributor

@jowlyzhang jowlyzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change! LGTM

@facebook-github-bot
Copy link
Contributor

@anand1976 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@anand1976 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@anand1976 merged this pull request in 0ed9355.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants