Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: libpe_rules: New timezone attribute in date_expression of rules #3068

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

nrwahl2
Copy link
Contributor

@nrwahl2 nrwahl2 commented Apr 5, 2023

Users can now add a "timezone" attribute to the date_expression element of a rule. The value must adhere to one of the formats for the TZ environment variable described in the tzset(3) man page.

If "timezone" is set, the date_expression is evaluated based on it. For example, suppose the system is using the America/Los_Angeles (std: UTC-8, dst: UTC-7) timezone, and the current time is 17:30. Suppose there's a rule with the following date_expression:

<date_expression id="location-dummy-rule-expr" operation="date_spec">
  <date_spec id="location-dummy-rule-expr-datespec" hours="20-21"/>
</date_expression>

This rule will not apply because it's out of range. However, suppose we
add a timezone attribute so that it's evaluated with respect to the
America/New_York (std: UTC-5, dst: UTC-4) time zone:

<date_expression id="location-dummy-rule-expr" operation="date_spec"
                 timezone=":America/New_York">
  <date_spec id="location-dummy-rule-expr-datespec" hours="20-21"/>
</date_expression>

Now the date_spec evaluation passes, and the rule applies. 17:30 in the America/Los_Angeles timezone is equivalent to 20:30 in the America/New_York timezone, and 20:30 is within the date_spec's range.

Other timezone formats are possible. Refer to the tzset(3) man page.

Closes T646
Closes RHBZ#2183466

@kgaillot
Copy link
Contributor

kgaillot commented Apr 5, 2023

ISO 8601 does support time zone offsets, though only hardcoded amounts, not locality strings that can take daylight saving time into account. So there is some question of how the system behaves if both the new field and an ISO 8601 offset are specified (which might just require documenting). We could reduce the scope for conflict by only allowing the locality names in the new field, since ISO 8601 can handle a numeric offset.

@kgaillot
Copy link
Contributor

kgaillot commented Apr 5, 2023

See also #2737 (T188)

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Apr 5, 2023

ISO 8601 does support time zone offsets, though only hardcoded amounts, not locality strings that can take daylight saving time into account.

hours="X" isn't part of the ISO 8601 standard though AFAIK. Date specs rather than dates were the originally reported issue. The new field addresses both date specs and dates.

So there is some question of how the system behaves if both the new field and an ISO 8601 offset are specified (which might just require documenting). We could reduce the scope for conflict by only allowing the locality names in the new field, since ISO 8601 can handle a numeric offset.

Documenting is probably the way to go. So far I've only skimmed the PR that you linked, but I did notice that our parsing of ISO 8601 strings is um... not robust. For one thing, cts-cli uses a couple of date/time strings of the format "2020-01-01 01:00:00 -0500". IDK yet whether "-0500" is allowed by the standard, but Pacemaker ignores it completely.

I think the best scenario would be that our time parsing functions honor whatever ISO 8601-correct timezone a user passes in (whether it does that currently is questionable and probably not reliable), and then the new timezone field overrides or shifts the originally parsed time (for start and end fields).

I'd like to call the parsing fixes out-of-scope. I wanted to mess with our time library as little as possible until the switch to GDateTime.

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Apr 5, 2023

The CI failures on FreeBSD are due to something that's not POSIX-compliant in the new sed expressions, unrelated to rules.

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Apr 5, 2023

IDK yet whether "-0500" is allowed by the standard, but Pacemaker ignores it completely.

On second thought, I'm not 100% sure whether it is ignored; it might get stored in crm_time_t:offset but wasn't relevant to the way the rules were configured. IIRC I was trying various tests with the TZ variable set differently from the string offset.

The parsing in crm_time_parse_offset() does not strike me as complete or robust though, at a glance

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Apr 5, 2023

Updated to address discussion:

  • Make sed expressions POSIX-compliant, so that hopefully they'll work on FreeBSD.
  • Make handling of timezones in start and end more consistent, to the extent we can do so without fixing the parsing.
  • Update the documentation to state that:
    • Parsing of timezones from ISO 8601 strings is incomplete and should not be relied upon.
    • start and end are adjusted to the specified or local timezone for proper comparison to the current time; this takes place after parsing and applying any timezone that may be present in the string.

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Apr 5, 2023

Oh and rebased on current main... crap, that makes reviewing the actual changes more difficult.

@kgaillot
Copy link
Contributor

kgaillot commented May 4, 2023

The thought of a rule in resource defaults using one time zone and a rule for a resource's instance attributes using another seems like trouble.

A cluster property has the problem that we'd have to use the system timezone to evaluate the properties themselves. We could possibly do that just to get the timezone then re-evaluate the rest using the configured one, even if that's a bit awkward. Honestly, anyone trying to use rules to put the cluster in one timezone during one time period and a different timezone during another time period deserves whatever happens.

A sysconfig variable would still be open to the same problem as using the system timezone, but it would let admins set a cluster time zone separate from the system one.

Nothing's ideal, but I'm leaning to a cluster property.

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented May 4, 2023

The thought of a rule in resource defaults using one time zone and a rule for a resource's instance attributes using another seems like trouble.

Yeah, I wouldn't expect it to misbehave though (haven't given it deep thought). It sounds like a bizarre configuration that would behave exactly how the user configured it to. Just don't do that if you don't want the rules to have different time zones.

A cluster property has the problem that we'd have to use the system timezone to evaluate the properties themselves. We could possibly do that just to get the timezone then re-evaluate the rest using the configured one, even if that's a bit awkward.

That was the initial plan and it's probably doable.

Honestly, anyone trying to use rules to put the cluster in one timezone during one time period and a different timezone during another time period deserves whatever happens.

100% agreed. Although it still bothers me a tiny bit to knowingly introduce a bug in a corner case.

A sysconfig variable would still be open to the same problem as using the system timezone, but it would let admins set a cluster time zone separate from the system one.

You wouldn't even need a new sysconfig variable. Just set the TZ variable inside the sysconfig file, and it will apply only to Pacemaker and (presumably) its child processes. I think the only downside is that it has to be done manually on every node.

@kgaillot
Copy link
Contributor

kgaillot commented May 4, 2023

It sounds like a bizarre configuration that would behave exactly how the user configured it to.

Yes, but the admin would have to remember to set it whenever creating a new rule (and won't necessarily be the same admin who created earlier rules). It would be a recurring problem.

Although it still bothers me a tiny bit to knowingly introduce a bug in a corner case.

It wouldn't be a bug, just a documented limitation.

You wouldn't even need a new sysconfig variable. Just set the TZ variable inside the sysconfig file, and it will apply only to Pacemaker and (presumably) its child processes. I think the only downside is that it has to be done manually on every node.

The other downside I forgot to mention above is that the command-line tools don't read sysconfig, so e.g. crm_simulate and crm_rule would give results based on the host timezone regardless of what's in sysconfig.

So, probably a cluster property is best.

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented May 4, 2023

The other downside I forgot to mention above is that the command-line tools don't read sysconfig

Ah that's right, we discussed that in the T646 task I believe

@nrwahl2 nrwahl2 marked this pull request as draft August 8, 2023 15:01
@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Aug 8, 2023

Rebased on current main and dropped test update commit (rather than re-running to fix conflict), no other changes

Internal-only

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Internal-only

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Internal-only

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
crm_time_get_timezone() should convert dt->offset to hours and minutes,
rather than converting dt->seconds. Additionally, the output variables
must be signed, since the offset can be negative.

However, we can't change the interface since it's public API. Create a
new internal function to get timezone correctly.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
This function has never worked correctly (it uses dt->seconds instead of
dt->offset, and it stores output values as unsigned). Nothing uses it
internally, so we deprecate it here. It will be removed in a future
release.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Given a timezone expressed as hours and minutes offset from UTC, adjust
the object's time to match the timezone and then set the object's
timezone.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
This will be used as a new cluster property name.

Ref T646

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
For the new "timezone" attribute of date_expression elements in rules.

Ref T646

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Closes T646
Closes RHBZ#2183466

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Oct 10, 2023

Rebased on current main, have not implemented cluster-timezone property yet

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Oct 11, 2023

@kgaillot Alright, I came back to this, thinking it'd be a trivial task before starting my options work on T620 :)

There's another wrinkle with implementing this as a property: refreshing the cluster-wide timezone in a running cluster (if it gets updated) is tough. Every daemon needs to use the configured timezone. Aside from any unforeseen side effects that updating it on the fly might have, some daemons aren't yet reading CIB property set at all -- let alone refreshing it after a CIB update.

My first thought was "Okay, let's document that cluster-timezone is processed only at Pacemaker start time, and implement it that way. We'll read it in pacemakerd, set TZ, and let the child processes inherit the value." That likely doesn't work either, because there may be more than one "startup." For instance, if pacemakerd dies and respawns, it'll read the new timezone and set the environment accordingly. But if child processes are already running, pacemakerd simply finds the existing processes, rather than restarting them. So pacemakerd itself and any newly started processes get the updated timezone, while any existing processes continue using the old timezone.

We might need to give every daemon a CIB diff callback (similar to crm_read_options()) and its own option set that includes cluster-timezone, and then have it read and refresh the timezone upon CIB diff.

In turn, as appealing as it is to knock this out right now, it might be better to wait until T620 is done and the options are more standardized and less redundant.

Thoughts?

@kgaillot
Copy link
Contributor

Yeah, let's leave this aside for now. It sounds like a sysconfig option would be better, but then it's less helpful since it still has to be set on every node. The only advantage would be to have a "cluster time" that's different from local time.

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Oct 11, 2023

Yeah, let's leave this aside for now. It sounds like a sysconfig option would be better, but then it's less helpful since it still has to be set on every node. The only advantage would be to have a "cluster time" that's different from local time.

As you've reminded me a couple of times, a sysconfig option wouldn't have an effect for CLI tools.

@kgaillot
Copy link
Contributor

Yeah, let's leave this aside for now. It sounds like a sysconfig option would be better, but then it's less helpful since it still has to be set on every node. The only advantage would be to have a "cluster time" that's different from local time.

As you've reminded me a couple of times, a sysconfig option wouldn't have an effect for CLI tools.

Ah of course. T574 would be a fix for that, but it would still require setting it separately on each node.

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Oct 11, 2023

Ah of course. T574 would be a fix for that, but it would still require setting it separately on each node.

We could rely on high-level CLI tools (pcs, crmsh) to set the sysconfig variable on all nodes, though that's less appealing. Also semantically, the sysconfig file is best suited for local settings. Most of them don't modify behavior in a meaningful way, and the few that do (node start state and crash handling) are restricted to local effects.

Timezone certainly can be a local effect but in the Pacemaker context it'd be ideal to keep it inherently in sync across the cluster, due to things like consistent rule evaluation.


Giving the remaining daemons a CIB diff callback that re-reads the CIB properties and sets the relevant options (which would include cluster-timezone for all daemons) might not actually be so hard -- though it's more work than I planned to do for this task.

It has another big downside though, which is that some daemons (execd, pacemakerd, schedulerd) don't have CIB connections. The scheduler somewhat gets around it by re-unpacking the config regularly. However, ideally everything would use the cluster-timezone whether it be for rule evaluation or for displaying timestamps or whatever.


For all its flaws, a timezone attribute for rules is the only one of the three approaches where such edge cases don't obviously come up. Those flaws mostly stem from the likelihood of a future admin forgetting to set the attribute. It would apply only to rules and not to, say, timestamps (which is arguably another flaw); but it would in its limited, documented context -- rule evaluation.

@kgaillot
Copy link
Contributor

Yeah I definitely don't want to add CIB connections to new daemons. A better model for the cluster property approach would be to add a new IPC/CPG command to each daemon for setting the timezone, and have the controller use those when necessary.

Why would daemons need it though? The scheduler reads it from the input CIB. The executor should probably have it for recording action history, but the others probably don't care. I suppose we could use it for log timestamps but that may be not worth the hassle.

If we make it an XML attribute of the <cib> element, we don't have to worry about rule evaluation for it.

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Oct 11, 2023

A better model for the cluster property approach would be to add a new IPC/CPG command to each daemon for setting the timezone, and have the controller use those when necessary.

That might work, just have the controller spread the word to everyone. That would address the issue of updates and of consistency.

Why would daemons need it though?

Any miscellaneous uses of times (like for displaying output) and for future-proofing. It's not strictly necessary for anything except rule evaluation, but the consistency would be nice.

If we make it an XML attribute of the element, we don't have to worry about rule evaluation for it.

Oh that's a good thought. I don't know if rule evaluation is actually that big a problem -- just fetch and evaluate the cluster-timezone first and then evaluate everything else.

Making it a <cib> attribute feels a bit less ideal (it fits in nicely as a property) and would require an update to pcs/crmsh, but it avoids having to reason about edge cases with depending on a date rule, like you said.

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Oct 16, 2023

While looking through undocumented environment variables, I found PCMK_respawned. If we wanted to make it so that cluster-timezone is applied only at cluster startup (depending on the approach we choose), then we could set it conditionally if PCMK_respawned is unset.

I may remove that variable soon since we only set it and never check it, but that is an idea to keep in mind.

@kgaillot
Copy link
Contributor

While looking through undocumented environment variables, I found PCMK_respawned. If we wanted to make it so that cluster-timezone is applied only at cluster startup (depending on the approach we choose), then we could set it conditionally if PCMK_respawned is unset.

I may remove that variable soon since we only set it and never check it, but that is an idea to keep in mind.

That variable has been invited to the party many times but never asked to dance :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants