capz-controller-manager fails to get api resources #4131

Closed
lieberlois opened this issue Oct 16, 2023 · 9 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@lieberlois

lieberlois commented Oct 16, 2023

/kind bug

[Before submitting an issue, have you checked the Troubleshooting Guide?]

What steps did you take and what happened:
I installed the ClusterAPI provider for Azure. The capz-controller-manager is in a crashloop because of the following logs:

E1016 11:21:18.266079       1 kind.go:68] controller-runtime/source/EventHandler "msg"="failed to get informer from cache" "error"="failed to get API group resources: unable to retrieve the complete list of server APIs: bootstrap.cluster.x-k8s.io/v1beta1: the server could not find the requested resource"

What did you expect to happen:
I expected the controller to be able to interact with the mentioned api resources.

Anything else you would like to add:
I have the Sidero ClusterAPI provider installed as well, which also has a bootstrap.cluster.x-k8s.io/v1alpha3 apiGroup.
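
One way to confirm which versions of that API group the server actually serves is to filter the output of kubectl api-versions, which prints one group/version per line. A minimal sketch; the helper name served_versions is hypothetical and not part of any tool mentioned here:

```shell
# List the versions the API server serves for one API group.
# Reads `kubectl api-versions` output (one group/version per line) on stdin.
served_versions() {
  group="$1"
  # Split each line on "/" and print the version when the group matches.
  awk -F/ -v g="$group" '$1 == g { print $2 }'
}

# Usage against a live management cluster:
#   kubectl api-versions | served_versions bootstrap.cluster.x-k8s.io
```

If v1beta1 is missing from the result, the error in the log above matches what the server reports.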

Environment:

  • cluster-api-provider-azure version: v1.11.3
  • Kubernetes version: (use kubectl version): 1.27.6
  • OS (e.g. from /etc/os-release): Ubuntu 22.04.3 LTS
k8s-ci-robot added the kind/bug label Oct 16, 2023
@CecileRobertMichon
Contributor

Does this error persist if you uninstall CAPI/CAPZ and reinstall it? Are all the pods healthy on the management cluster besides the CAPZ manager? I've seen this error happen transiently once in a test when cert-manager was not running.

@bryan-cox
Contributor

I'm running into something similar, but not sure it's the same bug.

E1027 23:44:59.733236       1 controller.go:324]  "msg"="Reconciler error" "error"="failed to create scope: failed to configure azure settings and credentials for Identity: failed to create copied AzureIdentity brcox-hypershift-arm-clusters-brcox-hypershift-arm-brcox-hypershift-arm in capz-system: failed to get API group resources: unable to retrieve the complete list of server APIs: aadpodidentity.k8s.io/v1: the server could not find the requested resource" "AzureMachine"={"name":"brcox-hypershift-arm-f412695a-jnk28","namespace":"clusters-brcox-hypershift-arm"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="brcox-hypershift-arm-f412695a-jnk28" "namespace"="clusters-brcox-hypershift-arm" "reconcileID"="6af4e450-887d-4c13-9ac3-a0d6e72305a4"

@CecileRobertMichon
Contributor

@bryan-cox same questions as above

Also found this from a quick search: helm/helm#6361 (comment)

Can you check kubectl get apiservices?

@bryan-cox
Contributor

I'll run that command and see.

I had deleted the pod but didn't try a new cluster or something like that.

@bryan-cox
Contributor

Ran that command on my mgmt cluster this morning. I didn't have any apiservices where AVAILABLE was False.
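
For anyone repeating this check, the output can be filtered so only unavailable entries are shown. A minimal sketch, assuming the default column order of kubectl get apiservices (NAME, SERVICE, AVAILABLE, AGE); the helper name unavailable_apiservices is hypothetical:

```shell
# Print the names of APIService entries whose AVAILABLE column is not "True".
# Reads `kubectl get apiservices` output on stdin; assumes the default
# column order NAME, SERVICE, AVAILABLE, AGE and skips the header row.
unavailable_apiservices() {
  awk 'NR > 1 && $3 != "True" { print $1 }'
}

# Usage against a live management cluster:
#   kubectl get apiservices | unavailable_apiservices
```

Empty output means every aggregated API reports as available, which is the result described above.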

@lieberlois
Author

lieberlois commented Oct 30, 2023

For me, explicitly specifying the bootstrap / control plane providers fixed the issue.

# This caused the issue mentioned above
clusterctl init -b talos -c talos -i sidero
clusterctl init -i azure

# This works fine
clusterctl init -b talos -c talos -i sidero
clusterctl init -i azure -b kubeadm -c kubeadm
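
A plausible explanation (an assumption, not confirmed in this thread) is that clusterctl adds the kubeadm bootstrap and control plane providers by default only on the first init, so the second init in the failing sequence never installed the kubeadm CRDs. One way to check is to look for the KubeadmConfig CRD on the management cluster. A minimal sketch; the helper name has_crd is hypothetical:

```shell
# Succeed if the named CRD appears in `kubectl get crd -o name` output,
# which is read from stdin (one resource name per line).
has_crd() {
  grep -Fqx "customresourcedefinition.apiextensions.k8s.io/$1"
}

# Usage against a live management cluster:
#   kubectl get crd -o name | has_crd kubeadmconfigs.bootstrap.cluster.x-k8s.io \
#     && echo "kubeadm bootstrap CRD present"
```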

@bryan-cox
Contributor

I got past my issue. It was a setup issue. I needed to set the AzureClusterIdentity type to ManualServicePrincipal and my secret ref was missing the key clientSecret.

@CecileRobertMichon
Contributor

Thanks for coming back here and letting us know what fixed it! Going to close this for now since you're both unblocked, feel free to open follow up issues for anything that could have made this easier to debug.

/close

@k8s-ci-robot
Contributor

@CecileRobertMichon: Closing this issue.

