WIP: MON-3513: add resource metrics api availability test #28699

slashpai · 2024-04-10T15:03:43Z

openshift/origin e2e tests don't explicitly exercise the metrics.k8s.io API group. We should add tests under test/extended/prometheus proving that the API works before and after an upgrade.

This will help show that the prometheus-adapter -> metrics-server migration happens seamlessly when we turn the feature gate on by default.

openshift-ci-robot · 2024-04-10T15:03:47Z

@slashpai: This pull request references MON-3513 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.16.0" version, but no target version was set.


slashpai · 2024-04-10T15:13:50Z

/retest

slashpai · 2024-04-10T15:18:57Z

/test verify

Signed-off-by: Jayapriya Pai <janantha@redhat.com>

openshift-ci · 2024-04-11T04:00:03Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: slashpai
Once this PR has been reviewed and has the lgtm label, please assign deads2k for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

test/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

rexagod · 2024-04-11T06:47:10Z

test/extended/prometheus/upgrade.go

+	g.By("verifying whether the nodes.metrics command works after the upgrade")
+	postUpgradeResponse, err := oc.Run("get").Args("nodes.metrics").Output()
+	o.Expect(err).NotTo(o.HaveOccurred())
+	o.Expect(postUpgradeResponse).NotTo(o.BeEmpty())


Not sure if there are any cases where non-emptiness could equate to an unexpected output (without any errors), but if so, would it be worth checking the contents of the output as well, just to be safe?
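One way such a content check could look is sketched below. It is only an illustration: the helper name `validateNodeMetricsOutput` is hypothetical, and it assumes the output of `oc get nodes.metrics` is a whitespace-separated table with a header line followed by one row per node. It deliberately checks structure only, not the metric values themselves.

```go
package main

import (
	"fmt"
	"strings"
)

// validateNodeMetricsOutput performs a light structural check on text that is
// assumed to be the tabular output of `oc get nodes.metrics`: a header line
// followed by at least one row with the same number of columns.
// The helper name and the whitespace-separated layout are assumptions made
// for this sketch; they are not taken from the PR.
func validateNodeMetricsOutput(out string) error {
	var rows [][]string
	for _, line := range strings.Split(out, "\n") {
		// Skip blank lines; keep every other line split into columns.
		if fields := strings.Fields(line); len(fields) > 0 {
			rows = append(rows, fields)
		}
	}
	if len(rows) < 2 {
		return fmt.Errorf("expected a header and at least one node row, got %d non-empty line(s)", len(rows))
	}
	header := rows[0]
	for i, row := range rows[1:] {
		if len(row) != len(header) {
			return fmt.Errorf("row %d has %d column(s), header has %d", i+1, len(row), len(header))
		}
	}
	return nil
}
```

In the test from the diff, this could replace or complement the `NotTo(o.BeEmpty())` assertion, for example by expecting `validateNodeMetricsOutput(postUpgradeResponse)` to return no error.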

I think checking for correctness should happen on the oc and/or CMO side. IIUC, the test is only meant to show that the API continues to work (is responsive/available) after the p-a -> m-server switch (which will happen during an upgrade).
We can even get rid of the test in future versions.

If not already done, JP could provoke a failure of the test (by putting something nonexistent in Args()) so we get an idea of what it would look like, and to test the test :)

I know the release folks are migrating to the new monitortests framework for upgrade tests: you can see how we use it here https://github.com/openshift/origin/pull/28361/files. But for a temp test, I'm ok with keeping it like this.

(If you agree with me that it's a temp test, it'd be great to add that as a comment in the test and to have a ticket that will revert it in the future)

The idea behind the test is to make sure the resource metrics API is always available. We didn't have a test for that; it was only being tested indirectly through HPA. The test will still be useful even after the full migration (PA -> MS) has happened. But for now, yes, we needed a test to prove it will always be functional.
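As a rough sketch of what "available" could mean beyond a non-empty response, the check below validates the JSON body of a `NodeMetricsList`, such as the one returned by `oc get --raw /apis/metrics.k8s.io/v1beta1/nodes`. The function and type names are hypothetical, and only the fields the check reads are modelled; it is an illustration under those assumptions, not the approach taken in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nodeMetricsList mirrors only the fields of a metrics.k8s.io/v1beta1
// NodeMetricsList that this check reads. The type name is hypothetical.
type nodeMetricsList struct {
	Kind  string `json:"kind"`
	Items []struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
		// Usage maps a resource name ("cpu", "memory") to a quantity string.
		Usage map[string]string `json:"usage"`
	} `json:"items"`
}

// checkNodeMetricsJSON returns an error unless raw decodes to a
// NodeMetricsList with at least one item and every item reports both
// cpu and memory usage. The function name is an assumption for this sketch.
func checkNodeMetricsJSON(raw []byte) error {
	var list nodeMetricsList
	if err := json.Unmarshal(raw, &list); err != nil {
		return fmt.Errorf("response is not valid JSON: %w", err)
	}
	if list.Kind != "NodeMetricsList" {
		return fmt.Errorf("unexpected kind %q", list.Kind)
	}
	if len(list.Items) == 0 {
		return fmt.Errorf("no node metrics returned")
	}
	for _, item := range list.Items {
		for _, res := range []string{"cpu", "memory"} {
			if item.Usage[res] == "" {
				return fmt.Errorf("node %q is missing %s usage", item.Metadata.Name, res)
			}
		}
	}
	return nil
}
```

Because the check depends only on the API's response shape, it should behave the same whether the API is backed by prometheus-adapter or metrics-server, which is the property the upgrade test is meant to demonstrate.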

We do have other tests as well, so would that mean we need to move all of them to the new framework?

If it's not a temp test, I'd go for the monitortests framework (I could help if you have questions.)
But let's see what the others think.
(I'd be interested in a commit that provokes a failure in both cases though)

Good remark on monitortests, I'm definitely not up to date with the origin testing 😅
It looks like a good option and the statefulsets example is a good starting point from what I can tell.

As discussed with JP, we'll go with #28737

slashpai · 2024-04-11T11:23:33Z

/retest-required

openshift-ci · 2024-04-11T13:32:35Z

@slashpai: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/e2e-aws-ovn-single-node-upgrade	`b28c376`	link	false	`/test e2e-aws-ovn-single-node-upgrade`
ci/prow/e2e-metal-ipi-ovn-ipv6	`b28c376`	link	true	`/test e2e-metal-ipi-ovn-ipv6`
ci/prow/e2e-gcp-csi	`b28c376`	link	false	`/test e2e-gcp-csi`
ci/prow/e2e-metal-ipi-sdn	`b28c376`	link	false	`/test e2e-metal-ipi-sdn`
ci/prow/e2e-aws-ovn-single-node-serial	`b28c376`	link	false	`/test e2e-aws-ovn-single-node-serial`


openshift-trt-bot · 2024-04-11T13:33:43Z

Job Failure Risk Analysis for sha: b28c376

Job Name	Failure Risk
pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-ipv6	IncompleteTests Tests for this run (23) are below the historical average (811): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

slashpai · 2024-04-24T12:26:44Z

Closing in favour of #28737

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 10, 2024

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 10, 2024

openshift-ci bot requested review from jan--f and simonpasquier April 10, 2024 15:05

slashpai force-pushed the metrics-server branch from a2c5f95 to 03aea86 Compare April 10, 2024 15:06

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 10, 2024

MON-3513: add resource metrics api availability test

b28c376

Signed-off-by: Jayapriya Pai <janantha@redhat.com>

slashpai force-pushed the metrics-server branch from 03aea86 to b28c376 Compare April 11, 2024 03:59

openshift-ci bot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 11, 2024

rexagod reviewed Apr 11, 2024

slashpai closed this Apr 24, 2024


rexagod Apr 11, 2024

machine424 Apr 11, 2024 •

edited

machine424 Apr 11, 2024

slashpai Apr 11, 2024

machine424 Apr 11, 2024

simonpasquier Apr 11, 2024

machine424 Apr 24, 2024
