Refactor common component metrics #11184

Merged
merged 3 commits into from Apr 9, 2024

Conversation

machadovilaca
Member

@machadovilaca machadovilaca commented Feb 7, 2024

What this PR does

Before this PR:

Old metric implementation style for client, workqueue and reflector metrics

After this PR:

Per kubevirt/community#219, refactor the client, workqueue and reflector metrics to follow the new approach

jira-ticket: https://issues.redhat.com/browse/CNV-27306
jira-ticket: https://issues.redhat.com/browse/CNV-38135
jira-ticket: https://issues.redhat.com/browse/CNV-38136

Fixes #

Why we need it and why it was done in this way

The following tradeoffs were made:

The following alternatives were considered:

Links to places where the discussion took place:

Special notes for your reviewer

Checklist

This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

Release note

none

@kubevirt-bot kubevirt-bot added release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. area/monitoring kind/build-change Categorizes PRs as related to changing build files of virt-* components size/XL labels Feb 7, 2024
@machadovilaca
Member Author

/cc @enp0s3

@machadovilaca machadovilaca force-pushed the refactor-client-metrics branch 2 times, most recently from b40b83f to 8615658 Compare February 8, 2024 15:38
@machadovilaca
Member Author

/test pull-kubevirt-e2e-k8s-1.29-sig-monitoring

Contributor

@enp0s3 enp0s3 left a comment

@machadovilaca Hi, thanks for the PR, looks good.
I saw that you've added new files. Can you please change the naming style to snake case instead of camel case? It's for consistency.

@@ -184,6 +172,31 @@ func getMetricsNotIncludeInEndpointByDefault() metricList {
return metrics
}

func getVirtControllerMetrics() metricList {
err := virt_controller.SetupMetrics(nil, nil, nil, nil, nil, nil, nil, nil)
checkError(err)
Contributor

checkError is really redundant; it doesn't fit the coding style IMO, and there is no added value in calling a dedicated function for that.

Member Author

It was already done this way in this script. Do you think it makes sense for me to edit the whole file and remove the checkError calls?
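The trade-off in this thread can be sketched as follows. This is a minimal illustration, not the script's actual code: `setupMetrics` is a hypothetical stand-in for a call such as `virt_controller.SetupMetrics`, and the body of `checkError` is an assumption, since the real helper is not shown in this conversation.

```go
package main

import (
	"errors"
	"fmt"
)

// setupMetrics is a hypothetical stand-in for a metrics setup call.
func setupMetrics(fail bool) error {
	if fail {
		return errors.New("setup failed")
	}
	return nil
}

// checkError shows one assumed shape for the helper under discussion:
// it stops the program on any non-nil error.
func checkError(err error) {
	if err != nil {
		panic(err)
	}
}

// collectInline handles the error at the call site instead and
// returns it wrapped with context for the caller to act on.
func collectInline(fail bool) error {
	if err := setupMetrics(fail); err != nil {
		return fmt.Errorf("setting up metrics: %w", err)
	}
	return nil
}

func main() {
	// Helper style: one line per call, but no context and no recovery.
	checkError(setupMetrics(false))

	// Inline style: slightly longer, but the caller sees what failed.
	fmt.Println(collectInline(true))
}
```

In a short generator script either style works; the inline form mostly pays off once callers need to report or recover from the failure.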

@@ -33,7 +33,10 @@ import (
// https://sdk.operatorframework.io/docs/best-practices/observability-best-practices/#metrics-guidelines
// should be ignored.
var excludedMetrics = map[string]struct{}{
-"kubevirt_vmi_phase_count": struct{}{},
+"kubevirt_vmi_phase_count": {},
Contributor

Why do you want to ignore them?

Member Author

This collector is used by the linter, and these metrics do not follow the naming conventions. That's why I'm excluding them from the linter until we refactor them.

Contributor

maybe worth changing the names to start with kubevirt_?

Member Author

I agree we need to do so, but I don't think it should be as part of this PR

Contributor

@machadovilaca Please add kubevirt_ prefix to all the metrics.
You already added reflector_ to the reflector metrics.

Member Author

I'm avoiding renaming any metrics as part of this PR.
reflector was already part of the metric names through the 'subsystem':

const reflectorSubsystem = "reflector"

I'm also avoiding using the subsystem method because of this, IMO it makes it harder to find metrics by name and it is easier to generate confusion
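To illustrate the point about searchability, here is a small sketch. `fqName` is a hypothetical helper that joins the non-empty parts with underscores, roughly the way a namespace/subsystem/name triple is turned into a full Prometheus metric name; `lists_total` is an illustrative metric name, not one taken from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

const reflectorSubsystem = "reflector"

// fqName is a hypothetical helper: it joins the non-empty parts with "_",
// which is roughly how a subsystem ends up inside the final metric name.
func fqName(namespace, subsystem, name string) string {
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	// Composed at runtime: the full name never appears in the source,
	// so searching the code for it finds nothing.
	composed := fqName("", reflectorSubsystem, "lists_total")

	// Written out in full: the same name, but directly searchable.
	literal := "reflector_lists_total"

	fmt.Println(composed == literal)
}
```

Both spellings produce the same exposed name, which is why dropping the subsystem field does not rename the metric.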

@machadovilaca
Member Author

/test pull-kubevirt-e2e-k8s-1.29-sig-monitoring

@machadovilaca
Member Author

@enp0s3 updated

@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 23, 2024
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 27, 2024
@assafad assafad mentioned this pull request Mar 7, 2024
8 tasks
@kubevirt-bot kubevirt-bot added sig/buildsystem Denotes an issue or PR that relates to changes in the build system. size/XXL sig/scale and removed size/XL labels Mar 8, 2024
@machadovilaca machadovilaca changed the title Refactor client metrics Refactor common component metrics Mar 8, 2024
@acardace
Member

acardace commented Apr 3, 2024

/approve

@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: acardace

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 3, 2024
@kubevirt-commenter-bot

Required labels detected, running phase 2 presubmits:
/test pull-kubevirt-e2e-windows2016
/test pull-kubevirt-e2e-kind-1.27-vgpu
/test pull-kubevirt-e2e-kind-sriov
/test pull-kubevirt-e2e-k8s-1.29-ipv6-sig-network
/test pull-kubevirt-e2e-k8s-1.27-sig-network
/test pull-kubevirt-e2e-k8s-1.27-sig-storage
/test pull-kubevirt-e2e-k8s-1.27-sig-compute
/test pull-kubevirt-e2e-k8s-1.27-sig-operator
/test pull-kubevirt-e2e-k8s-1.28-sig-network
/test pull-kubevirt-e2e-k8s-1.28-sig-storage
/test pull-kubevirt-e2e-k8s-1.28-sig-compute
/test pull-kubevirt-e2e-k8s-1.28-sig-operator

@kubevirt-commenter-bot

/retest-required
This bot automatically retries required jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@fossedihelm
Contributor

/hold
monitoring failure is relevant
@machadovilaca

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 5, 2024
Signed-off-by: João Vilaça <jvilaca@redhat.com>
@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 8, 2024
@machadovilaca
Member Author

machadovilaca commented Apr 8, 2024

/hold monitoring failure is relevant @machadovilaca

This fails due to the merge of #11307.
Reflector metrics have not been created for a while now: they were deprecated in Kubernetes because they were causing memory leaks (kubernetes/kubernetes#74636).

the method in controller-runtime was later removed in kubernetes-sigs/controller-runtime#1946, which doesn't affect KubeVirt yet

Reflector metrics were removed in Kubernetes
due to them causing memory leaks:
kubernetes/kubernetes#74636

Signed-off-by: João Vilaça <jvilaca@redhat.com>
@machadovilaca
Member Author

/test pull-kubevirt-unit-test-arm64
/test pull-kubevirt-e2e-k8s-1.29-sig-storage

@sradco
Contributor

sradco commented Apr 8, 2024

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 8, 2024
@kubevirt-commenter-bot

Required labels detected, running phase 2 presubmits:
/test pull-kubevirt-e2e-windows2016
/test pull-kubevirt-e2e-kind-1.27-vgpu
/test pull-kubevirt-e2e-kind-sriov
/test pull-kubevirt-e2e-k8s-1.29-ipv6-sig-network
/test pull-kubevirt-e2e-k8s-1.27-sig-network
/test pull-kubevirt-e2e-k8s-1.27-sig-storage
/test pull-kubevirt-e2e-k8s-1.27-sig-compute
/test pull-kubevirt-e2e-k8s-1.27-sig-operator
/test pull-kubevirt-e2e-k8s-1.28-sig-network
/test pull-kubevirt-e2e-k8s-1.28-sig-storage
/test pull-kubevirt-e2e-k8s-1.28-sig-compute
/test pull-kubevirt-e2e-k8s-1.28-sig-operator

@machadovilaca
Member Author

/unhold

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 8, 2024
@kubevirt-bot
Contributor

@machadovilaca: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubevirt-e2e-k8s-1.29-sig-compute | 58fb25a | link | unknown | /test pull-kubevirt-e2e-k8s-1.29-sig-compute |

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@kubevirt-commenter-bot

/retest-required
This bot automatically retries required jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@kubevirt-bot kubevirt-bot merged commit a2e118f into kubevirt:main Apr 9, 2024
39 of 40 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/monitoring dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/build-change Categorizes PRs as related to changing build files of virt-* components lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/buildsystem Denotes an issue or PR that relates to changes in the build system. sig/scale size/XXL

9 participants