-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WT-12096 Don't validate timestamp against the on-page value if the update is globally visible #10595
Conversation
@tod-johnson-mongodb do you want me to run this branch against the format reproducer I've used so far? @sueloverso github won't let me add a reply. So I am editing the comment. The error message from not using the v4 toolchain on file wiredtiger/bench/workgen/workgen.c |
My build on kodkod-aws completes. If there is a failure in building I don't see it and the process exits with success (unlike say a compilation error). I have started the format run. |
This is not code that I know well enough to say this is the correct fix. So I removed myself as reviewer and added @luke-pearson. I will say that my repro that readily failed has run for the last 4 days, over 2000 iterations, which is far longer than any of the failure runs took to hit. So that is a positive. |
The test failure in PR testing should be investigated to see if it is related to this change/fix since the test has to do with updates and tombstones (based on the name). |
This is a reconciliation change that should have a lot of eyes on it, I've added Chenhao. |
src/reconcile/rec_visibility.c
Outdated
if (prev_upd->start_ts < vpack->tw.start_ts || | ||
(WT_TIME_WINDOW_HAS_STOP(&vpack->tw) && prev_upd->start_ts < vpack->tw.stop_ts)) { | ||
/* | ||
* If prev_upd is globally visible it can be evicted. Thus this check applies only if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
evicted is not the right word. I think it should be the following updates can be removed by obsolete check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is evicted not the right word? That is the word used throughout this function. I will talk with you about this.
381 * If eviction reconciliation starts before checkpoint, it is fine to evict updates without
382 * timestamps.
389 WT_STAT_CONN_DATA_INCR(session, cache_eviction_blocked_remove_hs_race_with_checkpoint);
401 WT_STAT_CONN_DATA_INCR(session, cache_eviction_blocked_no_ts_checkpoint_race_2);
433 WT_STAT_CONN_DATA_INCR(session, cache_eviction_blocked_no_ts_checkpoint_race_4);
475 * If prev_upd is globally visible it can be evicted. Thus this check applies only if
476 * prev_upd is not globally visible.
482 WT_STAT_CONN_DATA_INCR(session, cache_eviction_blocked_no_ts_checkpoint_race_1);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
obsolete check may happen at any time not necessarily in eviction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rollback to stable may recover updates from the history store that is out of order to the on-disk value. Normally these updates have the WT_UPDATE_RESTORED_FROM_HS flag on them. However, in rare cases, if the newer update becomes globally visible, the restored update may be removed by the obsolete check. This may lead to an out of order edge case but it is benign. Check the global visibility of the update and ignore this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/reconcile/rec_visibility.c
Outdated
@@ -471,8 +471,13 @@ __rec_validate_upd_chain(WT_SESSION_IMPL *session, WT_RECONCILE *r, WT_UPDATE *s | |||
prev_upd->start_ts == prev_upd->durable_ts || !WT_TIME_WINDOW_HAS_STOP(&vpack->tw) || | |||
prev_upd->durable_ts >= vpack->tw.durable_stop_ts, | |||
"Durable timestamps cannot be out of order for prepared updates"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me and tod discussed about changing these assert as
if (WT_TIME_WINDOW_HAS_STOP(&vpack->tw))
...
else
...
This improves readability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not done.
-
When we discussed this I misread this. The first WT_ASSERT_ALWAYS() is checking vpack->tw.durable_start_ts. The second WT_ASSERT_ALWAYS() is checking vpack->tw.durable_stop_ts.
-
If WT_ASSERT_ALWAYS() fails the assert condition is printed. Moving part of the condition out of the WT_ASSERT_ALWAYS() prints just part of the assert condition if the assert fails.
I did change the asserts to have different comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The part that is moved out of the assert is not important.
src/reconcile/rec_visibility.c
Outdated
"Durable timestamps cannot be out of order for prepared updates"); | ||
if (prev_upd->start_ts < vpack->tw.start_ts || | ||
(WT_TIME_WINDOW_HAS_STOP(&vpack->tw) && prev_upd->start_ts < vpack->tw.stop_ts)) { | ||
if (prev_upd->prepare_state == WT_PREPARE_INPROGRESS || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we agree to do
if (WT_TIME_WINDOW_HAS_STOP(&vpack->tw))
WT_ASSERT_ALWAYS(session,
prev_upd->prepare_state == WT_PREPARE_INPROGRESS || prev_upd->start_ts == prev_upd->durable_ts ||
prev_upd->durable_ts >= vpack->tw.durable_stop_ts,
"Stop: Durable timestamps cannot be out of order for prepared updates");
else
WT_ASSERT_ALWAYS(session, prev_upd->prepare_state == WT_PREPARE_INPROGRESS || prev_upd->start_ts == prev_upd->durable_ts || prev_upd->durable_ts >= vpack->tw.durable_start_ts,
"Start: Durable timestamps cannot be out of order for prepared updates");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this be best?
if (prev_upd->prepare_state != WT_PREPARE_INPROGRESS && prev_upd->start_ts != prev_upd->durable_ts) {
WT_ASSERT_ALWAYS(session, prev_upd->durable_ts >= vpack->tw.durable_start_ts && (!WT_TIME_WINDOW_HAS_STOP(&vpack->tw) || prev_upd->durable_ts >= vpack->tw.durable_stop_ts));
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer my version because I think the code is clearer in that way. We compare the stop time pair if on the disk image we have one. Otherwise, we compare with the start time pair.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay fair enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
This solution skips comparing the last update in the update chain with the on disk update if the last update is globally visible. This ignores the fact that it may have already walked obsolete updates prior to that. Would it not be better if the loop on line 420 became:
The only concern I have with this solution is that that globally visible update could still be OOO with respect to the on disk update but it shouldn't matter as its globally visible? |
This can work because we are here only in eviction. My concern is that checking global visibility for each update is expensive.
This is OK and it is exactly the exception case that is false positive. |
I suppose that is true, we can definitely reduce the number of checks by keeping only one... I guess for performance reasons we'll only do one check but it should be detailed in a comment why we chose to do it that way @tod-johnson-mongodb . |
a742934
to
e8fb849
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
6fde309
to
26abd29
Compare
f389f62
to
34b2389
Compare
…visible. * Fix EBUSY check and add an explaining comment. * Improve two WT_ASSERT_ALWAYS() preceeding the EBUSY check. * Add a test program.
34b2389
to
3a25300
Compare
Test coverage is too low, this change probably shouldn't be merged as-is.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@tod-johnson-mongodb Are you ready to merge this? |
I am merging this. I have two approvals. Sue said she didn't need to re-review since I already had two approvals. I didn't hear whether Sean wanted to re-review. |
If an update is globally visible it can be evicted and EBUSY should not be returned. Thus the check for EBUSY should be skipped if an update is globally visible. The bug is the check for EBUSY was not skipped if an update is globally visible.
Please review just all commits. There were temporary changes that have been backed out.