Trimming most recent timestamp history #781

Open
peroket opened this issue Nov 27, 2023 · 0 comments
Labels
enhancement New feature or request


Owner:

Is your feature request related to a problem? Please describe.
When using timestamps, we have a use case where we need to remove the most recent history, that is, the versions of a key whose timestamps are above a threshold. The key should not be marked as deleted; everything should behave as if the versions above the threshold had never been inserted, so that values with lower timestamps become visible again. This must be possible atomically with other operations, so that the DB keeps a consistent view while it is used from multiple threads (a write batch is ideal for that).

// Create database with timestamp
rocksdb::DB * database = ...;

// Add history
database->Put(rocksdb::WriteOptions(), "key", "timestamp1", "value1");
database->Put(rocksdb::WriteOptions(), "key", "timestamp2", "value2");

// Reading back with newer timestamp gives us the most recent value
database->Get(rocksdb::ReadOptions{.timestamp = "timestamp3"}, "key"); // == "value2"

// Delete the last entry, but don't delete the key itself (requested feature; DeleteAt does not exist today)
database->DeleteAt("key", "timestamp2");

// After that the previous value becomes visible again
database->Get(rocksdb::ReadOptions{.timestamp = "timestamp3"}, "key"); // == "value1"

Describe the solution you'd like
I want to be able to delete a specific set of key + timestamp combinations in a write batch (so that several such combinations can be deleted atomically), removing only that specific history without marking the key itself as deleted.
A write batch call that deletes all keys with timestamps greater than or equal to a given value in timestamp-enabled columns would also be fine, but I need to be able to perform regular operations on columns without timestamps at the same time.
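
To make the requested semantics concrete, here is a minimal in-memory sketch of both variants: removing one exact (key, timestamp) version, and trimming everything at or above a cutoff. It is only a model under stated assumptions: `VersionedStore`, `Batch`, `DeleteAt` and `TrimFrom` are hypothetical names and not part of the RocksDB API, timestamps are plain integers, there is a single keyspace instead of columns, and concurrency is left out, so "atomic" here just means the operations are applied together in one `Write` call.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory model of the requested behavior. None of these
// names are RocksDB API; timestamps are plain integers for simplicity.
class VersionedStore {
 public:
  // Operations are recorded in order and applied together by Write().
  class Batch {
   public:
    void Put(std::string key, uint64_t ts, std::string value) {
      ops_.push_back({kPut, std::move(key), ts, std::move(value)});
    }
    // Remove exactly one (key, ts) version, leaving no tombstone.
    void DeleteAt(std::string key, uint64_t ts) {
      ops_.push_back({kDeleteAt, std::move(key), ts, ""});
    }
    // Remove every version of every key with timestamp >= cutoff.
    void TrimFrom(uint64_t cutoff) {
      ops_.push_back({kTrimFrom, "", cutoff, ""});
    }

   private:
    friend class VersionedStore;
    enum Kind { kPut, kDeleteAt, kTrimFrom };
    struct Op {
      Kind kind;
      std::string key;
      uint64_t ts;
      std::string value;
    };
    std::vector<Op> ops_;
  };

  // Applies all operations of the batch in the order they were recorded.
  void Write(const Batch& batch) {
    for (const Batch::Op& op : batch.ops_) {
      switch (op.kind) {
        case Batch::kPut:
          data_[op.key][op.ts] = op.value;
          break;
        case Batch::kDeleteAt: {
          auto it = data_.find(op.key);
          if (it == data_.end()) break;
          it->second.erase(op.ts);
          if (it->second.empty()) data_.erase(it);
          break;
        }
        case Batch::kTrimFrom:
          for (auto it = data_.begin(); it != data_.end();) {
            auto& versions = it->second;
            versions.erase(versions.lower_bound(op.ts), versions.end());
            if (versions.empty()) it = data_.erase(it); else ++it;
          }
          break;
      }
    }
  }

  // Reads the newest version with timestamp <= read_ts.
  // Returns false if no such version exists.
  bool Get(const std::string& key, uint64_t read_ts, std::string* value) const {
    auto it = data_.find(key);
    if (it == data_.end()) return false;
    auto v = it->second.upper_bound(read_ts);  // first version > read_ts
    if (v == it->second.begin()) return false;
    *value = std::prev(v)->second;
    return true;
  }

 private:
  std::map<std::string, std::map<uint64_t, std::string>> data_;
};

// Mirrors the scenario from the problem description.
void Demo() {
  VersionedStore db;
  VersionedStore::Batch setup;
  setup.Put("key", 1, "value1");
  setup.Put("key", 2, "value2");
  db.Write(setup);

  std::string v;
  assert(db.Get("key", 3, &v) && v == "value2");

  VersionedStore::Batch batch;
  batch.DeleteAt("key", 2);        // drop only the newest version
  batch.Put("other", 1, "x");      // unrelated write in the same batch
  db.Write(batch);
  assert(db.Get("key", 3, &v) && v == "value1");
}
```

In this model the older version becomes visible again simply because the newer one is physically gone, which is the difference from a regular delete that would write a tombstone at a timestamp.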

Describe alternatives you've considered
As far as I know, there is no way to do this. We can't stop the database to call OpenAndTrimHistory. The solution we use for now is to copy the value from the previous timestamp to a newer timestamp, but that leads to side effects that need to be handled: versions no longer encode only changes and may contain duplicates, and values end up written at timestamps above our current "official" timestamp, which can lead to hard-to-track bugs. For example, if I "erase" my history from timestamp 3 down to timestamp 1 and then write at timestamp 2, my value at timestamp 3 and above will not be what I expect. If I use delete instead, I have a different set of issues. The workaround also wastes disk space.
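
The pitfall of the copy-based workaround can be reproduced with a small in-memory model of a single key's history. This is a sketch of one possible reading of the workaround (overwriting every newer version with the value visible at the rollback point); `History`, `ReadAt`, `RollbackByCopy` and `RollbackByTrim` are hypothetical names, and timestamps are plain integers.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

// Versions of a single key, ordered by timestamp (illustrative model only).
using History = std::map<uint64_t, std::string>;

// Newest value with timestamp <= read_ts; empty string if there is none.
std::string ReadAt(const History& h, uint64_t read_ts) {
  auto it = h.upper_bound(read_ts);  // first version > read_ts
  if (it == h.begin()) return "";
  return std::prev(it)->second;
}

// Workaround: "erase" history above keep_ts by overwriting every newer
// version with the value that is visible at keep_ts.
void RollbackByCopy(History& h, uint64_t keep_ts) {
  const std::string base = ReadAt(h, keep_ts);
  for (auto it = h.upper_bound(keep_ts); it != h.end(); ++it) {
    it->second = base;
  }
}

// Requested behavior: physically drop every version above keep_ts.
void RollbackByTrim(History& h, uint64_t keep_ts) {
  h.erase(h.upper_bound(keep_ts), h.end());
}

// Roll back from timestamp 3 to timestamp 1, then write at timestamp 2.
void Demo() {
  History copied = {{1, "a"}, {2, "b"}, {3, "c"}};
  History trimmed = copied;

  RollbackByCopy(copied, 1);
  RollbackByTrim(trimmed, 1);
  copied[2] = "d";
  trimmed[2] = "d";

  // The stale copy at timestamp 3 shadows the new write at timestamp 2.
  assert(ReadAt(copied, 3) == "a");
  // With trimming, the new write is what a reader at timestamp 3 sees.
  assert(ReadAt(trimmed, 3) == "d");
}
```

The copied history also keeps three entries where the trimmed one keeps two, which illustrates the duplicate versions and wasted space mentioned above.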

bosmatt added the enhancement label Dec 3, 2023