CORS-3466: CAPG set instance group name #8314

Merged: 6 commits into openshift:master on May 3, 2024

patrickdillon
Contributor

Sets the instance group name in CAPG to use "master", to be compatible with MAPI conventions, rather than "apiserver" as used by CAPI.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 24, 2024
@openshift-ci-robot
Contributor

openshift-ci-robot commented Apr 24, 2024

@patrickdillon: This pull request references CORS-3466 which is a valid jira issue.

@patrickdillon
Contributor Author

/label platform/google

@patrickdillon
Contributor Author

/test altinfra-e2e-gcp-ovn

Contributor

openshift-ci bot commented Apr 24, 2024

@patrickdillon: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test agent-integration-tests
  • /test altinfra-images
  • /test altinfra-openstack-capi-manifests
  • /test altinfra-periodics-images
  • /test aro-unit
  • /test e2e-agent-compact-ipv4
  • /test e2e-aws-ovn
  • /test e2e-aws-ovn-edge-zones-manifest-validation
  • /test e2e-aws-ovn-upi
  • /test e2e-azure-ovn
  • /test e2e-azure-ovn-upi
  • /test e2e-gcp-ovn
  • /test e2e-gcp-ovn-upi
  • /test e2e-metal-ipi-ovn-ipv6
  • /test e2e-openstack-ovn
  • /test e2e-vsphere-ovn
  • /test e2e-vsphere-upi
  • /test gofmt
  • /test golint
  • /test govet
  • /test images
  • /test okd-images
  • /test okd-scos-images
  • /test okd-unit
  • /test okd-verify-codegen
  • /test openstack-manifests
  • /test shellcheck
  • /test terraform-images
  • /test terraform-verify-vendor
  • /test tf-lint
  • /test unit
  • /test verify-codegen
  • /test verify-vendor
  • /test yaml-lint

The following commands are available to trigger optional jobs:

  • /test altinfra-e2e-aws-custom-security-groups
  • /test altinfra-e2e-aws-ovn
  • /test altinfra-e2e-aws-ovn-fips
  • /test altinfra-e2e-aws-ovn-imdsv2
  • /test altinfra-e2e-aws-ovn-localzones
  • /test altinfra-e2e-aws-ovn-proxy
  • /test altinfra-e2e-aws-ovn-public-ipv4-pool
  • /test altinfra-e2e-aws-ovn-shared-vpc
  • /test altinfra-e2e-aws-ovn-shared-vpc-local-zones
  • /test altinfra-e2e-aws-ovn-shared-vpc-wavelength-zones
  • /test altinfra-e2e-aws-ovn-single-node
  • /test altinfra-e2e-aws-ovn-wavelengthzones
  • /test altinfra-e2e-azure-capi-ovn
  • /test altinfra-e2e-gcp-capi-ovn
  • /test altinfra-e2e-gcp-ovn-byo-network-capi
  • /test altinfra-e2e-gcp-ovn-secureboot-capi
  • /test altinfra-e2e-gcp-ovn-xpn-capi
  • /test altinfra-e2e-ibmcloud-capi-ovn
  • /test altinfra-e2e-nutanix-capi-ovn
  • /test altinfra-e2e-openstack-capi-ccpmso
  • /test altinfra-e2e-openstack-capi-ccpmso-zone
  • /test altinfra-e2e-openstack-capi-dualstack
  • /test altinfra-e2e-openstack-capi-dualstack-upi
  • /test altinfra-e2e-openstack-capi-dualstack-v6primary
  • /test altinfra-e2e-openstack-capi-externallb
  • /test altinfra-e2e-openstack-capi-nfv-intel
  • /test altinfra-e2e-openstack-capi-ovn
  • /test altinfra-e2e-openstack-capi-proxy
  • /test altinfra-e2e-powervs-capi-ovn
  • /test altinfra-e2e-vsphere-capi-ovn
  • /test altinfra-e2e-vsphere-capi-static-ovn
  • /test altinfra-e2e-vsphere-capi-zones
  • /test azure-ovn-marketplace-images
  • /test e2e-agent-compact-ipv4-appliance
  • /test e2e-agent-compact-ipv4-appliance-diskimage
  • /test e2e-agent-compact-ipv4-none-platform
  • /test e2e-agent-ha-dualstack
  • /test e2e-agent-sno-ipv4-pxe
  • /test e2e-agent-sno-ipv6
  • /test e2e-aws-custom-security-groups
  • /test e2e-aws-overlay-mtu-ovn-1200
  • /test e2e-aws-ovn-edge-zones
  • /test e2e-aws-ovn-fips
  • /test e2e-aws-ovn-imdsv2
  • /test e2e-aws-ovn-proxy
  • /test e2e-aws-ovn-public-subnets
  • /test e2e-aws-ovn-shared-vpc
  • /test e2e-aws-ovn-shared-vpc-edge-zones
  • /test e2e-aws-ovn-single-node
  • /test e2e-aws-ovn-upgrade
  • /test e2e-aws-ovn-workers-rhel8
  • /test e2e-aws-upi-proxy
  • /test e2e-azure-ovn-resourcegroup
  • /test e2e-azure-ovn-shared-vpc
  • /test e2e-azurestack
  • /test e2e-azurestack-upi
  • /test e2e-crc
  • /test e2e-gcp-ovn-shared-vpc
  • /test e2e-gcp-ovn-xpn
  • /test e2e-gcp-secureboot
  • /test e2e-gcp-upgrade
  • /test e2e-gcp-upi-xpn
  • /test e2e-ibmcloud-ovn
  • /test e2e-metal-assisted
  • /test e2e-metal-ipi-ovn
  • /test e2e-metal-ipi-ovn-dualstack
  • /test e2e-metal-ipi-ovn-swapped-hosts
  • /test e2e-metal-ipi-ovn-virtualmedia
  • /test e2e-metal-single-node-live-iso
  • /test e2e-nutanix-ovn
  • /test e2e-openstack-ccpmso
  • /test e2e-openstack-ccpmso-zone
  • /test e2e-openstack-dualstack
  • /test e2e-openstack-dualstack-upi
  • /test e2e-openstack-externallb
  • /test e2e-openstack-nfv-intel
  • /test e2e-openstack-proxy
  • /test e2e-vsphere-static-ovn
  • /test e2e-vsphere-upi-zones
  • /test e2e-vsphere-zones
  • /test e2e-vsphere-zones-techpreview
  • /test okd-e2e-agent-compact-ipv4
  • /test okd-e2e-agent-ha-dualstack
  • /test okd-e2e-agent-sno-ipv6
  • /test okd-e2e-aws-ovn
  • /test okd-e2e-aws-ovn-upgrade
  • /test okd-e2e-gcp
  • /test okd-e2e-gcp-ovn-upgrade
  • /test okd-e2e-vsphere
  • /test okd-scos-e2e-aws-ovn
  • /test okd-scos-e2e-aws-upgrade
  • /test okd-scos-e2e-gcp
  • /test okd-scos-e2e-gcp-ovn-upgrade
  • /test okd-scos-e2e-vsphere
  • /test okd-scos-unit
  • /test okd-scos-verify-codegen
  • /test tf-fmt

Use /test all to run the following jobs that were automatically triggered:

  • pull-ci-openshift-installer-master-altinfra-e2e-aws-ovn
  • pull-ci-openshift-installer-master-altinfra-images
  • pull-ci-openshift-installer-master-altinfra-periodics-images
  • pull-ci-openshift-installer-master-aro-unit
  • pull-ci-openshift-installer-master-e2e-aws-custom-security-groups
  • pull-ci-openshift-installer-master-e2e-aws-ovn
  • pull-ci-openshift-installer-master-e2e-aws-ovn-edge-zones
  • pull-ci-openshift-installer-master-e2e-aws-ovn-edge-zones-manifest-validation
  • pull-ci-openshift-installer-master-e2e-aws-ovn-fips
  • pull-ci-openshift-installer-master-e2e-aws-ovn-imdsv2
  • pull-ci-openshift-installer-master-e2e-aws-ovn-shared-vpc
  • pull-ci-openshift-installer-master-e2e-aws-ovn-shared-vpc-edge-zones
  • pull-ci-openshift-installer-master-e2e-aws-ovn-single-node
  • pull-ci-openshift-installer-master-e2e-gcp-ovn
  • pull-ci-openshift-installer-master-e2e-gcp-ovn-shared-vpc
  • pull-ci-openshift-installer-master-e2e-gcp-ovn-xpn
  • pull-ci-openshift-installer-master-e2e-gcp-secureboot
  • pull-ci-openshift-installer-master-gofmt
  • pull-ci-openshift-installer-master-golint
  • pull-ci-openshift-installer-master-govet
  • pull-ci-openshift-installer-master-images
  • pull-ci-openshift-installer-master-okd-e2e-aws-ovn-upgrade
  • pull-ci-openshift-installer-master-okd-images
  • pull-ci-openshift-installer-master-okd-scos-images
  • pull-ci-openshift-installer-master-okd-scos-unit
  • pull-ci-openshift-installer-master-okd-scos-verify-codegen
  • pull-ci-openshift-installer-master-okd-unit
  • pull-ci-openshift-installer-master-okd-verify-codegen
  • pull-ci-openshift-installer-master-shellcheck
  • pull-ci-openshift-installer-master-tf-fmt
  • pull-ci-openshift-installer-master-tf-lint
  • pull-ci-openshift-installer-master-unit
  • pull-ci-openshift-installer-master-verify-codegen
  • pull-ci-openshift-installer-master-verify-vendor
  • pull-ci-openshift-installer-master-yaml-lint

@patrickdillon
Contributor Author

/test altinfra-e2e-gcp-capi-ovn

@bfournie
Contributor

Just need one more change to use "master" instead of "apiserver" here: https://github.com/openshift/installer/blob/master/pkg/infrastructure/gcp/clusterapi/network.go#L133

The instance groups are now properly named, e.g.:

$ gcloud compute instance-groups list --uri 
https://www.googleapis.com/compute/v1/projects/openshift-dev-installer/zones/us-east1-b/instanceGroups/bfournie-capg-test-vlc9t-master-us-east1-b
https://www.googleapis.com/compute/v1/projects/openshift-dev-installer/zones/us-east1-c/instanceGroups/bfournie-capg-test-vlc9t-master-us-east1-c
https://www.googleapis.com/compute/v1/projects/openshift-dev-installer/zones/us-east1-d/instanceGroups/bfournie-capg-test-vlc9t-master-us-east1-d
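
The names in the listing above appear to follow an `<infra ID>-master-<zone>` pattern. A minimal Go sketch of that composition, assuming the pattern inferred from this output only; `instanceGroupName` is a hypothetical helper for illustration, not a function from the installer or from CAPG:

```go
package main

import "fmt"

// instanceGroupName is a hypothetical helper, not taken from the installer
// or CAPG. It assumes the "<infraID>-<role>-<zone>" pattern inferred from
// the gcloud listing in this comment.
func instanceGroupName(infraID, role, zone string) string {
	return fmt.Sprintf("%s-%s-%s", infraID, role, zone)
}

func main() {
	// Infra ID and zones are copied from the gcloud output above.
	for _, zone := range []string{"us-east1-b", "us-east1-c", "us-east1-d"} {
		fmt.Println(instanceGroupName("bfournie-capg-test-vlc9t", "master", zone))
	}
}
```

With the role set to "master" rather than "apiserver", the composed names would match the last path segment of each URI listed above.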

@patrickdillon
Contributor Author

Thanks. Pushed!

That's the same error as ci turned up. Let's see if this fixes it!

/test altinfra-e2e-gcp-capi-ovn

@bfournie
Contributor

/retest

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 25, 2024
@patrickdillon patrickdillon force-pushed the bump-capg branch 2 times, most recently from 77b46c3 to b59516a April 25, 2024 17:17
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 25, 2024
@patrickdillon
Contributor Author

/test altinfra-e2e-gcp-capi-ovn

1 similar comment
@patrickdillon
Contributor Author

/test altinfra-e2e-gcp-capi-ovn

@patrickdillon
Contributor Author

/test altinfra-e2e-gcp-capi-ovn

@patrickdillon
Contributor Author

Nice: after a failure in the last run, the install has completed!

INFO[2024-04-25T21:26:34Z] Step e2e-gcp-capi-ovn-ipi-install-install succeeded after 53m47s.

@patrickdillon
Contributor Author

capi e2es failed with:

  • [Jira:"Networking / router"] monitor test service-type-load-balancer-availability setup (11m20s)
  • [sig-arch] events should not repeat pathologically for ns/openshift-marketplace (0s)
  • [OLM][invariant] alert/KubePodNotReady should not be at or above info in ns/openshift-marketplace (0s)
  • [sig-instrumentation] Prometheus [apigroup:image.openshift.io] when installed on the cluster shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured [Early][apigroup:config.openshift.io] [Skipped:Disconnected] [Suite:openshift/conformance/parallel]
  • [sig-operator] an end user can use OLM can subscribe to the operator [apigroup:config.openshift.io] [Skipped:Disconnected] [Skipped:NoOptionalCapabilities] [Suite:openshift/conformance/parallel]

Running another e2e test to see which failures stick:
/test altinfra-e2e-gcp-capi-ovn

e2e failures shouldn't be a blocker for this PR, but we will need to determine if any of these failures are caused by the capi install process.

@patrickdillon
Contributor Author

/retest-required

@patrickdillon
Contributor Author

patrickdillon commented Apr 29, 2024

/retest-required
/test altinfra-e2e-gcp-capi-ovn

only

[Jira:"Networking / router"] monitor test service-type-load-balancer-availability setup (11m17s)
{  failed during setup
could not reach http://34.170.127.198:80/echo?msg=hello reliably: timed out waiting for the condition}

is failing now

@barbacbd
Contributor

/cc

@openshift-ci openshift-ci bot requested a review from barbacbd April 29, 2024 15:22
@bfournie
Contributor

/retest

Contributor

@barbacbd barbacbd left a comment

/cc @bfournie

/lgtm to me. Bob, what do you think?

@openshift-ci openshift-ci bot requested a review from bfournie April 30, 2024 17:53
@patrickdillon
Contributor Author

@patrickdillon seeing some e2e failures due to oauth2. As @r4f4 pointed out the last time I updated capg, may need to keep the oauth2 pinned, see #8153 (comment)

Oh, OK, I see the replace in the provider. Will fix tomorrow. The firewall change as I have it is not fixing the e2e, so let's tackle that separately.

@bfournie
Contributor

bfournie commented May 2, 2024

It looks like the service-type-load-balancer-availability test which was failing before is now passing after the healthcheck fix
time="2024-05-01T22:50:28Z" level=info msg="response Body:{\"ProwJobName\":\"pull-ci-openshift-installer-master-altinfra-e2e-gcp-capi-ovn\",\"ProwJobRunID\":1785766397066874880,\"Release\":\"Presubmits\",\"CompareRelease\":\"4.16\",\"Tests\":[{\"Name\":\"[Jira:\\\"Networking / router\\\"] monitor test service-type-load-balancer-availability setup\",\"Risk\":{\"Level\":{\"Name\":\"High\",\"Level\":100},\"Reasons\":[\"This test has passed 100.00% of 21 runs on jobs ['periodic-ci-openshift-release-master-ci-4.16-e2e-gcp-ovn'] in the last 14 days.\"]},\"OpenBugs\":[]}],\"OverallRisk\":{\"Level\":{\"Name\":\"High\",\"Level\":100},\"Reasons\":[\"Maximum failed test risk: High\"]},\"OpenBugs\":[]}\n"

@patrickdillon
Contributor Author

It looks like that result is coming from periodic-ci-openshift-release-master-ci-4.16-e2e-gcp-ovn, which is still on Terraform. The capi presubmit is still failing the e2e: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_installer/8314/pull-ci-openshift-installer-master-altinfra-e2e-gcp-capi-ovn/1785766397066874880

@bfournie
Contributor

bfournie commented May 2, 2024

That was from the altinfra test ("ProwJobName": "pull-ci-openshift-installer-master-altinfra-e2e-gcp-capi-ovn"). The e2e test is still failing, but it will be interesting to see what happens after oauth2 is pinned. Anyway, once that is pinned we should be able to merge this and work on any other e2e failures in a separate PR.
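
To see which job a risk-analysis result names and which job its reason text cites, the response body from the log line above can be decoded. A minimal Go sketch, assuming the field names visible in that log line; the Go type and function names (`riskAnalysis`, `parseRisk`) are hypothetical, and the body is the logged response with the log escaping removed and trimmed to the fields read here:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// riskAnalysis mirrors only the fields read below. The JSON keys come from
// the logged response body; the Go type names are hypothetical.
type riskAnalysis struct {
	ProwJobName string
	Tests       []struct {
		Name string
		Risk struct {
			Level   struct{ Name string }
			Reasons []string
		}
	}
}

// body is the logged response, unescaped and trimmed to the fields used.
const body = `{"ProwJobName":"pull-ci-openshift-installer-master-altinfra-e2e-gcp-capi-ovn","Tests":[{"Name":"[Jira:\"Networking / router\"] monitor test service-type-load-balancer-availability setup","Risk":{"Level":{"Name":"High","Level":100},"Reasons":["This test has passed 100.00% of 21 runs on jobs ['periodic-ci-openshift-release-master-ci-4.16-e2e-gcp-ovn'] in the last 14 days."]}}]}`

// parseRisk decodes a risk-analysis response body.
func parseRisk(data []byte) (riskAnalysis, error) {
	var ra riskAnalysis
	err := json.Unmarshal(data, &ra)
	return ra, err
}

func main() {
	ra, err := parseRisk([]byte(body))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("analyzed job:", ra.ProwJobName)
	for _, t := range ra.Tests {
		fmt.Println("test:", t.Name)
		fmt.Println("risk:", t.Risk.Level.Name)
		for _, r := range t.Risk.Reasons {
			fmt.Println("reason:", r)
		}
	}
}
```

Read this way, `ProwJobName` names the altinfra presubmit, while the pass rate in the reason text refers to the periodic job's history, which is consistent with both replies above.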

@patrickdillon
Contributor Author

Fixed the oauth2 replace in 02fa61c.

I removed the commit containing the attempted fix for the ingress e2e test failure. Will follow up in a separate PR.

@patrickdillon
Contributor Author

/retest

@patrickdillon
Contributor Author

/test altinfra-e2e-gcp-capi-ovn

@bfournie
Contributor

bfournie commented May 2, 2024

/retest

@bfournie
Contributor

bfournie commented May 2, 2024

/test altinfra-e2e-gcp-capi-ovn

@patrickdillon
Contributor Author

/test altinfra-e2e-gcp-capi-ovn

The previous runs had a build error, which I missed locally because my cached binary was skipping the build. This resolves the build error, but oauth2 is still unpinned; I still need to resolve that.

Pulls in changes needed for custom instance groups.

go get sigs.k8s.io/cluster-api-provider-gcp@main
go mod tidy
Pulls in custom instance group functionality.
go get sigs.k8s.io/cluster-api-provider-gcp@main && go mod tidy
go mod vendor
Updates CAPG infra components to bring in customizable instance groups.
Sets the instance group role tag to "master" to be consistent with
the role tag expected by MAPI--rather than "apiserver" used by CAPI.
@patrickdillon
Contributor Author

/test altinfra-e2e-gcp-capi-ovn

1 similar comment
@patrickdillon
Contributor Author

/test altinfra-e2e-gcp-capi-ovn

@patrickdillon
Contributor Author

/retest-required

@patrickdillon
Contributor Author

Install succeeded on the last run, but it looks like there were CI scheduling issues that caused a lot of jobs to fail.

Contributor

@sadasu sadasu left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 3, 2024
Contributor

openshift-ci bot commented May 3, 2024

@patrickdillon: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                          Commit   Details  Required  Rerun command
ci/prow/okd-e2e-aws-ovn-upgrade    bb8b059  link     false     /test okd-e2e-aws-ovn-upgrade
ci/prow/e2e-gcp-secureboot         bb8b059  link     false     /test e2e-gcp-secureboot
ci/prow/altinfra-e2e-gcp-capi-ovn  bb8b059  link     false     /test altinfra-e2e-gcp-capi-ovn
ci/prow/e2e-gcp-ovn-xpn            bb8b059  link     false     /test e2e-gcp-ovn-xpn
ci/prow/e2e-aws-ovn-edge-zones     8c7d510  link     false     /test e2e-aws-ovn-edge-zones
ci/prow/e2e-gcp-ovn-shared-vpc     bb8b059  link     false     /test e2e-gcp-ovn-shared-vpc

@openshift-merge-bot openshift-merge-bot bot merged commit a6d9a1a into openshift:master May 3, 2024
20 of 25 checks passed
@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

This PR has been included in build ose-baremetal-installer-container-v4.17.0-202405040320.p0.ga6d9a1a.assembly.stream.el9 for distgit ose-baremetal-installer.
All builds following this will include this PR.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. platform/google

7 participants