Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alerting: Add recording rules to ruler API and validation #87779

Merged
merged 19 commits into from
May 21, 2024

Conversation

alexweav
Copy link
Contributor

@alexweav alexweav commented May 13, 2024

What is this feature?

Adds support for creating recording rules in the main API.

Recording rules have a new field, Record, and lack several of the other "statefulness" fields that only make sense for alerting rules.

Extends the validation logic to have two paths. Shared validation happens, then we try to detect if the rule is a recording or alerting rule, then validate the remaining fields as needed.

Why do we need this feature?

Allows users to manipulate recording rules over the API, and eventually the UI.

Which issue(s) does this PR fix?:

n/a

Special notes for your reviewer:

Please check that:

  • It works as expected from a user's perspective.
  • If this is a pre-GA feature, it is behind a feature toggle.
  • The docs are updated, and if this is a notable improvement, it's added to our What's New doc.

@grafana-delivery-bot grafana-delivery-bot bot added this to the 11.1.x milestone May 13, 2024
@alexweav alexweav force-pushed the alexweav/recording-rule-main-api branch from 3dbf09d to be43867 Compare May 14, 2024 20:41
@alexweav alexweav added no-changelog Skip including change in changelog/release notes no-backport Skip backport of PR area/alerting Grafana Alerting labels May 14, 2024
@alexweav alexweav marked this pull request as ready for review May 15, 2024 14:36
@alexweav alexweav requested review from a team as code owners May 15, 2024 14:36
@alexweav alexweav requested review from jtheory, rwwiv, JacobsonMT, yuri-tceretian and grobinson-grafana and removed request for a team May 15, 2024 14:36
@@ -530,6 +530,7 @@ func toGettableExtendedRuleNode(r ngmodels.AlertRule, provenanceRecords map[stri
Provenance: apimodels.Provenance(provenance),
IsPaused: r.IsPaused,
NotificationSettings: AlertRuleNotificationSettingsFromNotificationSettings(r.NotificationSettings),
Record: ApiRecordFromModelRecord(r.Record),
Copy link
Contributor Author

@alexweav alexweav May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always allow reading of recording rules regardless of feature toggle.

If we don't allow this, things get really weird in terms of consistency/ordering if you enable the toggle, create a RR, then disable the toggle. A more advanced solution seemed pointless as the toggle will go away in the future. Basically, if you do this, the group will still work and can be read, but can't be edited until the RR is removed.

Comment on lines -87 to -100
if len(ruleNode.GrafanaManagedAlert.Data) == 0 {
if canPatch {
if ruleNode.GrafanaManagedAlert.Condition != "" {
return nil, fmt.Errorf("%w: query is not specified by condition is. You must specify both query and condition to update existing alert rule", ngmodels.ErrAlertRuleFailedValidation)
}
} else {
return nil, fmt.Errorf("%w: no queries or expressions are found", ngmodels.ErrAlertRuleFailedValidation)
}
} else {
err = validateCondition(ruleNode.GrafanaManagedAlert.Condition, ruleNode.GrafanaManagedAlert.Data)
if err != nil {
return nil, fmt.Errorf("%w: %s", ngmodels.ErrAlertRuleFailedValidation, err.Error())
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this extra logic surrounding validateCondition is now shoved inside validateCondition. validateCondition now handles all these cases and can be directly called from any path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're referring to validateAlertingRule here right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the logic surrounding validateCondition (if canPatch { ...). It's put inside validateCondition now, so code can just call validateCondition straight up without needing to worry about this extra stuff that's also validating part of the condition.

Then on top of that, the call to validateCondition got put inside validateAlertingRule. but, the comment was referring to the surrounding logic being put inside validateCondition. That's why you don't see this whole block copied inside validateAlertingRule, just the call to validateCondition and nothing else.

Comment on lines +192 to +203
if canPatch {
// Patch requests may leave both query and condition blank. If a request supplies one, it must supply the other.
if len(queries) == 0 && condition == "" {
return nil
}
if len(queries) == 0 && condition != "" {
return fmt.Errorf("%w: query is not specified but condition is. You must specify both query and condition to update existing alert rule", ngmodels.ErrAlertRuleFailedValidation)
}
if len(queries) > 0 && condition == "" {
return fmt.Errorf("%w: condition is not specified but query is. You must specify both query and condition to update existing alert rule", ngmodels.ErrAlertRuleFailedValidation)
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a less nested version of the if/else logic surrounding validateCondition's old call site. IMO it shows intent much better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, much easier to follow 👍

Comment on lines +912 to +913
Condition: "A",
Data: []apimodels.AlertQuery{},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rule in this test was invalid in more than one way. By reordering validation checks, it caused this test to no longer cover what it was supposed to cover.

By fixing condition, we now cover the intent of this test a bit better, and it's not so tied to implementation.

@stevesg stevesg self-requested a review May 15, 2024 14:58
Copy link
Contributor

@rwwiv rwwiv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left a note about side effects in the new validation functions but it isn't blocking.

@@ -145,6 +100,85 @@ func validateRuleNode(
return &newAlertRule, nil
}

func validateAlertingRule(in *apimodels.PostableExtendedRuleNode, newRule *ngmodels.AlertRule, canPatch bool) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The side effect modification of newRule isn't super obvious without reading the whole function, WDYT about this and validateRecordingRule either returning the validated rule or adding docstrings to mention that field is modified?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's fair. The main reason I didn't return a new object was that it wouldn't be fully formed yet. The result of the function wouldn't be a usable alert rule, it would be missing many shared fields like Title. I thought this captured the idea better that "this function has to be used alongside other things"

I thought about instead inverting this - where the shared validations are also extracted to a function. Then, the top level would call just validateAlertingRule, which would then call sharedValidations and then all the alert specific validations. Then, it could more stably return a fully-formed rule.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did the comment approach for now, and I also renamed the functions to validate<X>RuleFields to better express that they only validate a subset of the rule.

I think that's probably enough, if you disagree i could go the fully separated route with a function for common validation as well. The cost of this is more indirection in the code, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline, pushed another update which modifies without a reference and returns the result. This also changed the call site a bit

Comment on lines +192 to +203
if canPatch {
// Patch requests may leave both query and condition blank. If a request supplies one, it must supply the other.
if len(queries) == 0 && condition == "" {
return nil
}
if len(queries) == 0 && condition != "" {
return fmt.Errorf("%w: query is not specified but condition is. You must specify both query and condition to update existing alert rule", ngmodels.ErrAlertRuleFailedValidation)
}
if len(queries) > 0 && condition == "" {
return fmt.Errorf("%w: condition is not specified but query is. You must specify both query and condition to update existing alert rule", ngmodels.ErrAlertRuleFailedValidation)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, much easier to follow 👍

pkg/services/ngalert/api/tooling/api.json Outdated Show resolved Hide resolved
@alexweav alexweav force-pushed the alexweav/recording-rule-main-api branch from e2a4dcf to 716ca2e Compare May 21, 2024 15:56
@alexweav alexweav merged commit 49c8deb into main May 21, 2024
12 checks passed
@alexweav alexweav deleted the alexweav/recording-rule-main-api branch May 21, 2024 19:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/alerting Grafana Alerting area/backend area/frontend no-backport Skip backport of PR no-changelog Skip including change in changelog/release notes
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

2 participants