-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WT-12861 consolidate cache config validation into a single function #10548
base: develop
Are you sure you want to change the base?
WT-12861 consolidate cache config validation into a single function #10548
Conversation
aaa0d7c
to
fb6ec16
Compare
Test coverage is too low, this change probably shouldn't be merged as-is.
|
…g-validation-into-a-single-function' into wt-12861-consolidate-cache-config-validation-into-a-single-function
…ion-into-a-single-function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice one! Left a few comments.
src/conn/conn_cache.c
Outdated
if (cache->eviction_dirty_target > cache->eviction_target) { | ||
if (debug_config) | ||
__wt_verbose_warning(session, WT_VERB_CONFIGURATION, | ||
"config eviction dirty target=%f is larger than eviction target=%f", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we say the new value we set it to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, added in: d2ff734
src/conn/conn_cache.c
Outdated
@@ -419,3 +463,13 @@ __wti_cache_destroy(WT_SESSION_IMPL *session) | |||
__wt_free(session, conn->cache); | |||
return (ret); | |||
} | |||
|
|||
#ifdef HAVE_UNITTEST |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that used anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was for the unit test, now removed: 820f876
src/include/extern.h
Outdated
@@ -2601,6 +2601,8 @@ extern int __ut_ovfl_discard_wrapup(WT_SESSION_IMPL *session, WT_PAGE *page) | |||
WT_GCC_FUNC_DECL_ATTRIBUTE((warn_unused_result)); | |||
extern int __ut_ovfl_track_init(WT_SESSION_IMPL *session, WT_PAGE *page) | |||
WT_GCC_FUNC_DECL_ATTRIBUTE((warn_unused_result)); | |||
extern int __ut_validate_cache_config(WT_SESSION_IMPL *session, const char *cfg[], bool create) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed here: 820f876
src/conn/conn_cache.c
Outdated
* Validate trigger and target values of given configs. | ||
*/ | ||
static int | ||
__validate_cache_config(WT_SESSION_IMPL *session, const char *cfg[], bool create) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For naming, the static function should start with the subsystem name. Also we generally put the verb last. So I suggest __conn_cache_config_validate
as the name. The name also makes me think it is only validating the existing values. This function is both setting up the values and then validating so it is a bit misleading. Some possibilities:
- Split it into two functions like
conn_cache_config
to extract all the values and thenconn_cach_config_validate
that does all the checking. - Like the above but have it called by a wrapper that higher level functions call so callers don't need to make 2 different calls.
- Keep it as is but change
validate
to something that reflects both purposes (by looking at other similar functions and use some consistent word).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, Sue! I will be renaming the function for better clarity.
Regarding splitting the functions: the original purpose of this ticket was to change the logic for checking values in WT_CACHE so they would only be set after validation. In other words, for invalid configurations, they would not be set, and the function would fail immediately.
However, since we need to keep auto-fixing for now, extracting the values and setting them in WT_CACHE is the best approach. In the future, we do want to progressively move to the original idea. In that case, if we have a function that extracts and sets the cache values, it would need to be modified to include a temporary structure for storing the config values to validate and another function for setting all valid configs to the WT_CACHE structure of the current session. So it will be changed entirely.
Since there could be major changes to this part of the code, I'm open to discussing the best way to structure it now to benefit and simplify future modifications. I could also rename the function to reflect both extraction and validation, like __conn_cache_config_prepare
or __conn_cache_config_setup
. Please let me know what you think!
src/conn/conn_cache.c
Outdated
cache = conn->cache; | ||
|
||
WT_RET(__wt_config_gets(session, cfg, "debug_mode.configuration", &cval)); | ||
debug_config = cval.val; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this already done from wti_debug_mode_configuration
and, instead of needing the boolean, you can just use FLD_ISSET(conn, WT_CONN_DEBUG_CONFIGURATION)
? It is better to have one place getting the configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that the debug flags are set after setting the cache. @bzm-jas posted a comment about it in the JIRA ticket here.
I asked in our team channel if there was any specific reason why we do it in this order. The current code seems to indicate this order is required but I don't understand why and I wonder if this comment was valid before but now now:
/*
* This function expects the cache to be created so parse this after the rest of the connection
* is set up.
*/
WT_ERR(__wti_debug_mode_config(session, cfg));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree using FLD_ISSET(conn, WT_CONN_DEBUG_CONFIGURATION)
is preferable. I'd suggest
/* Debug flags are not yet set when this function runs during connection open. Set it now. */
WT_RET(__wt_config_gets(session, cfg, "debug_mode.configuration", &cval));
if (cval.val)
FLD_SET(conn->debug_flags, WT_CONN_DEBUG_CONFIGURATION);
else
FLD_CLR(conn->debug_flags, WT_CONN_DEBUG_CONFIGURATION);
if(FLD_ISSET(conn, WT_CONN_DEBUG_CONFIGURATION))
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish we didn't have to read the configuration twice and call __wti_debug_mode_config
beforehand but if that's the only solution, I'd be happy with that as well!
} | ||
|
||
WT_RET(__wt_config_gets(session, cfg, "eviction_target", &cval)); | ||
cache->eviction_target = (double)cval.val; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is confusing to see that cache->eviction_target
was just checked above and now we're setting it. Maybe add a comment after the conditional that now we're resetting the values via reconfigure or restart or however we get here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good, comments added in: c7c0742
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dug into the callstacks a bit from what I can tell the code looks like:
__wti_cache_create()
__wti_cache_config()
__cache_config_local()
__validate_cache_config(create=false) // This makes sure that triggers are greater than targets
__validate_cache_config(create=true) // This re-validates that triggers are greater than targets
so whenever we call __validate_cache_config(create=true)
we must have already set the triggers to be greater than the targets in __validate_cache_config(create=false)
.
I think we can drop the second call to __validate_cache_config(create=true)
and remove the create=true
logic from this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the create=true
logic and the second call to __validate_cache_config(create=true)
in: 9e4baa7
src/conn/conn_cache.c
Outdated
cache->eviction_target = (double)cval.val; | ||
|
||
WT_RET(__wt_config_gets(session, cfg, "eviction_trigger", &cval)); | ||
cache->eviction_trigger = (double)cval.val; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If someone can reconfigure all the values, don't we need to repeat all the same invalid checks we did on create
? For example, if one can reconfigure both cache->eviction_target
and cache->eviction_trigger
don't we need to check like at line 69? The checks below do not repeat the checks above but do potentially change values in an invalid way.
If so, then this code should be re-arranged to have one set of those EINVAL conditionals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that both checks should be performed, but using EINVAL for all checks causes a lot of failures, I will look into finding an approach to log these messages without breaking our tests.
@@ -577,6 +577,9 @@ def __ge__(self, other): | |||
adjust log removal to retain the log records of this number of checkpoints. Zero | |||
or one means perform normal removal.''', | |||
min='0', max='1024'), | |||
Config('configuration', 'false', r''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we planning to extend this functionality to other configs? If debug_mode.configuration
only applies to the cache config I think debug_mode.cache_configuration
is a better name.
If we make the above change WT_CONN_DEBUG_CONFIGURATION
could also do with a rename.
src/conn/conn_cache.c
Outdated
cache = conn->cache; | ||
|
||
WT_RET(__wt_config_gets(session, cfg, "debug_mode.configuration", &cval)); | ||
debug_config = cval.val; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree using FLD_ISSET(conn, WT_CONN_DEBUG_CONFIGURATION)
is preferable. I'd suggest
/* Debug flags are not yet set when this function runs during connection open. Set it now. */
WT_RET(__wt_config_gets(session, cfg, "debug_mode.configuration", &cval));
if (cval.val)
FLD_SET(conn->debug_flags, WT_CONN_DEBUG_CONFIGURATION);
else
FLD_CLR(conn->debug_flags, WT_CONN_DEBUG_CONFIGURATION);
if(FLD_ISSET(conn, WT_CONN_DEBUG_CONFIGURATION))
...
src/conn/conn_cache.c
Outdated
@@ -61,6 +175,8 @@ __cache_config_local(WT_SESSION_IMPL *session, bool shared, const char *cfg[]) | |||
conn = S2C(session); | |||
cache = conn->cache; | |||
|
|||
__validate_cache_config(session, cfg, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__validate_cache_config(session, cfg, false); | |
WT_RET(__validate_cache_config(session, cfg, false)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed here: 9e4baa7
src/conn/conn_cache.c
Outdated
WT_RET(__wt_config_gets(session, cfg, "cache_overhead", &cval)); | ||
cache->overhead_pct = (u_int)cval.val; | ||
|
||
WT_RET(__wt_config_gets(session, cfg, "eviction_target", &cval)); | ||
cache->eviction_target = (double)cval.val; | ||
WT_RET( | ||
__cache_config_abs_to_pct(session, &(cache->eviction_target), "eviction target", shared)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These __cache_config_abs_to_pct
calls are modifying the field we used to read on the previous line. They need to be kept directly after the = cval.val
calls to make sure our validations checks don't compare (for example) a trigger measured as an absolute value against a target defined as a percentage.
This could lead to a case like
total_cache = 200MB
trigger = 100MB (50% of cache)
target = 0.8 (80% of cache)
but our validation reports a valid config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally __cache_config_abs_to_pct
uses conn->cache_size
to convert to a percentage, so when we move these __cache_config_abs_to_pct
calls we also need to make sure conn->cache_size
gets read before we call __validate_cache_config
. Currently we're calling __validate_cache_config
first.
The way we access conn->cache
inside __cache_config_abs_to_pct
feels like a footgun to me. I'd prefer if we passed conn->cache
into __cache_config_abs_to_pct
as an argument instead of invisibly accessing S2C(session)->conn_cache. We can do this in a separate ticket if you don't want to add more changes to this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for picking this up, I've modified __validate_cache_config
so now the values gets changed to percentage straight away after retrieving and setting the value. See this commit: 9e4baa7
I'd prefer to change the way we access conn->cache
in another ticket because I think there's already quite a bit of changes going on for this PR, it might be clearer to do for a seperate ticket.
} | ||
|
||
WT_RET(__wt_config_gets(session, cfg, "eviction_target", &cval)); | ||
cache->eviction_target = (double)cval.val; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dug into the callstacks a bit from what I can tell the code looks like:
__wti_cache_create()
__wti_cache_config()
__cache_config_local()
__validate_cache_config(create=false) // This makes sure that triggers are greater than targets
__validate_cache_config(create=true) // This re-validates that triggers are greater than targets
so whenever we call __validate_cache_config(create=true)
we must have already set the triggers to be greater than the targets in __validate_cache_config(create=false)
.
I think we can drop the second call to __validate_cache_config(create=true)
and remove the create=true
logic from this function.
src/conn/conn_cache.c
Outdated
if (cache->eviction_dirty_target > cache->eviction_target) { | ||
if (debug_config) | ||
__wt_verbose_warning(session, WT_VERB_CONFIGURATION, | ||
"config eviction_dirty_target=%f is larger than eviction_target=%f, changing " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General wording suggestion for this and the other warning messages
"config eviction_dirty_target=%f is larger than eviction_target=%f, changing " | |
"config eviction_dirty_target (%f) cannot exceed than eviction_target (%f). Setting" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment rephrased in: 9e4baa7
src/conn/conn_cache.c
Outdated
"config eviction_updates_target=%f is less than DBL_EPSILON=%f,changing " | ||
"eviction_updates_target to %f.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DBL_EPSILON
is an internal variable name. We should avoid displaying it to the end user:
"config eviction_updates_target=%f is less than DBL_EPSILON=%f,changing " | |
"eviction_updates_target to %f.", | |
"config eviction_updates_target (%f) cannot be zero. Setting to 50% of eviction_updates_target (%f)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment rephrased in: 9e4baa7
class test_config12(wttest.WiredTigerTestCase): | ||
|
||
@contextmanager | ||
def expect_verbose(self, config, patterns, expect_output = True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can define this function inside WiredTigerTestCase
instead of having two definitions here and in test_verbose01.py
.
@etienneptl you wrote the original test_verbose01.py
. Was there a reason we defined expect_verbose in test_verbose01.py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply because it was the only test using that function and it was designed to meet the needs of verbose01
. This version is slightly different than the one defined in verbose01
but I agree that it should be accessible from any test, it can be useful.
with self.expect_verbose('debug_mode=(configuration=true)', ['config checkpoint target=.*is less than eviction dirty target=.*', | ||
'config eviction updates target=.*is less than DBL_EPSILON=.*', | ||
'config eviction updates trigger=.*is less than DBL_EPSILON=.*']): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some more tests cases for non-default configs? For example we're currently not testing cases like cache->eviction_dirty_trigger > cache->eviction_trigger
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will add at least one test for each of the checks.
test/suite/test_config12.py
Outdated
def test_config12(self): | ||
self.conn.close() | ||
# Test the default WT connection configuration with debug mode enabled. Expect warning messages. | ||
with self.expect_verbose('debug_mode=(configuration=true)', ['config eviction_checkpoint_target=.*cannot be less than eviction_dirty_target=.*Setting eviction_checkpoint_target to.*', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we define the second arg as a variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, defined as a variable here: 5f55183
No description provided.