MON-3763: Add cnv_abnormal #2291
base: master
/hold blocked by kubevirt/hyperconverged-cluster-operator#2855
ae4876a to 6fbd10f
/lgtm
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: avlitman, sradco The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
New changes are detected. LGTM label has been removed.
/unhold Created https://issues.redhat.com/browse/MON-3763. @jan--f I'd appreciate your review and approval. Also, should I cherry-pick this to 4.16, since I see we already have 4.17? Thanks.
Looking at https://github.com/kubevirt/hyperconverged-cluster-operator/blob/a98cb26b9ad8774cd676f703600bb0fe407875cc/pkg/monitoring/rules/recordingrules/operator.go#L26-L33, I'm not sure why you have a recording rule that only renames the metric. Why not send the original metric?
For reference:
@@ -240,6 +240,12 @@ data:
#
# owners: (https://github.com/kubevirt)
#
# cnv_abnormal represents the reason why the operator might have an issue
# and includes the node, namespace, container, reason labels.
Why include the node name? Is it meaningful?
The node label wasn't mentioned in https://issues.redhat.com/browse/MON-3763
Yes, we need the node label for sure. I'll mention it in the ticket.
But the namespace label was removed, so I removed it from the YAML as well.
For now 4.16 == 4.17 (until the 4.16 branch is officially cut).
you'll need to run
@simonpasquier done (:
/hold
@avlitman: This pull request references MON-3763 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.16.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
cnv_abnormal holds issues with the pods for each container, e.g. memory exceeded value. Signed-off-by: avlitman <alitman@redhat.com>
@avlitman: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Add a new recording rule for CNV issues to telemeter. For now, the only issue it reports is exceeded memory.