unable to retrieve the complete list of server APIs #6361
Comments
I have the same issue on AKS, though the error message is
My config:
|
I believe this issue started recently in my case, and it seems to be related to having knative installed (on IBM Cloud IKS this is a managed option). I've uninstalled knative and am OK for now, but there could be an interop issue here. @kalioz, out of interest, are you using knative on AWS? Actually it looks like you aren't, since I can't see the tekton objects. |
I have just seen this issue myself. In my case it was cert-manager that triggered the problem. Still working on how to get it back to how it was. |
@planetf1 I'm not using knative (or at least I don't think I am), but the problem only exists on the new cluster I deployed for this test.
So I have some major changes. To me the problem is that helm3 crashes because it cannot access some APIs, even though those APIs are not used by the chart I'm trying to deploy. |
I am using it on a Kubernetes cluster, version 1.13.9, and the same error occurs when deploying any stable chart. helm version helm.go:81: [debug] unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request. |
After resolving the issue with the metrics pod (I can't remember how I solved it; I think it had to do with hostNetwork, or simply with restarting the associated pod), helm3 functions as expected. |
It's really, really annoying as someone starting out with Kubernetes. I'm hand rolling a solution for certificates using acme, since I can't guarantee that cert manager won't still be broken even after configuring it. The really annoying part is I can't just use helm to uninstall cert manager and get back to where I was! Anything which allows a strongly recommended service to break it, and won't undo the change is broken. |
For anyone who hits this, it's caused by api-services that no longer have backends running... In my case it was KEDA, but there are a number of different services that install aggregated API servers. To fix it:
Look for the ones that are not available. If you don't need those APIs any more, delete them:
Then Helm should work properly. I think improving the Helm error message for this case may be worthwhile... |
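The fix described above can be sketched in shell. This is a minimal sketch and not the commenter's exact commands: the helper name unavailable_apiservices is hypothetical, and it assumes the default table layout of kubectl get apiservice, where the third column is AVAILABLE and reads True for healthy services.

```shell
# Print the names of APIServices whose AVAILABLE column is not "True".
# Reads the table produced by `kubectl get apiservice` on stdin, so the
# filter can be tried on saved output without touching a cluster.
# (Hypothetical helper; assumes the default NAME/SERVICE/AVAILABLE/AGE columns.)
unavailable_apiservices() {
  awk 'NR > 1 && $3 != "True" { print $1 }'
}

# Typical use, run against your own cluster:
#   kubectl get apiservice | unavailable_apiservices
# Then, only for APIs you no longer need:
#   kubectl delete apiservice <name>
```

Deleting an APIService removes that API group from the cluster, so it is worth checking each name the filter prints before deleting anything.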
Thanks for the explanation - is there a way Helm could code around this too? |
We think so, though we're still investigating. My first look suggests that this is just related to our usage of the Discovery API, which is used for the |
Same with
This is pretty annoying. A warning instead of a failure would indeed be much better. EDIT: can someone confirm |
@sjentzsch I am also seeing the same using Helm |
If this does also affect 2.x then everyone using "cert-manager" (possibly only pre-configuration) is going to have a bad time. |
Here we have two different cases with the same behavior on the Helm side. As @technosophos mentioned, Helm uses the discovery API and fails if any of the API responses fails (lines 105 to 118 in f1dc847)
and for this case you can easily fix it by
Currently it is alive and running, but it was accidentally down during Helm's request.
I'm sure that Helm must be more robust against this type of issue.
|
We have a similar issue with 2.15.1 on Kubernetes 1.15.5, but NOT with Helm 2.14.3. The issue is intermittent: some charts install OK, but then they begin to fail.
|
We hit this issue when trying to upgrade to Helm 2.15.2 on the charts CI cluster. So, it's not only a Helm 3 issue. Deleting the missing API service fixed it. I wonder if Helm could be more graceful here, especially since this could probably pop up again any time. |
Hit a similar problem installing the stable/metrics-server chart on a kubeadm installed cluster. When you attempt to uninstall the chart, the uninstall fails with an api-server error (because metrics server is fubar), and that leaves a load of dangling resources lying around that you have to clean up by hand - since helm has removed the release from its database anyway.
|
Started hitting this recently in freshly created GKE clusters, using 2.15.1 (might have upgraded recently via Snap). Also reported as kubernetes/kubernetes#72051 (comment). Seem to be able to work around by preceding every kubectl --namespace=kube-system wait --for=condition=Available --timeout=5m apiservices/v1beta1.metrics.k8s.io |
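The workaround above, waiting for the metrics APIService before running another command, could be wrapped as follows. This is only a sketch: the function name with_metrics_api and the KUBECTL override variable are hypothetical, while the kubectl wait invocation is the one quoted in the comment.

```shell
# Hypothetical wrapper: block until the metrics APIService reports Available,
# then run whatever command was passed as arguments.
# KUBECTL can be overridden (for example with a stub) to try the function
# without a cluster.
with_metrics_api() {
  "${KUBECTL:-kubectl}" --namespace=kube-system wait \
    --for=condition=Available --timeout=5m \
    apiservices/v1beta1.metrics.k8s.io || return 1
  "$@"
}

# Example (the release and chart names are placeholders):
#   with_metrics_api helm install my-release ./my-chart
```

If the wait times out, the wrapped command is not run at all, which keeps a failed discovery call from leaving a half-finished release behind.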
@jglick In your case is it happening only when the cluster is first created? The problem is deep down in the Kubernetes Go discovery client. I am experimenting with just printing a warning. However, that could have negative consequences for charts that heavily rely on the Capabilities object. |
This blocks a particular error (caused by upstream discovery client), printing a warning instead of failing. It's not a great solution, but is a stop-gap until Client-Go gets fixed. Closes helm#6361 Signed-off-by: Matt Butcher <matt.butcher@microsoft.com>
Can confirm I'm having this issue as well. Hoping for a fix. |
Solution: The steps I followed are:
For me it was:
|
kubectl api-resources is easily broken, e.g. https://access.redhat.com/solutions/4379461 helm/helm#6361 (comment) But we need to be able to deploy until the cluster admins fix it.
@brendandburns Glad to have found your answer after a few hours of googling! :-D Too bad this is not StackOverflow, you'd deserve quite a few upvotes ;-) |
On Amazon EKS I had to uninstall their metrics server. That cleaned up the error.
Use this command to verify that you don't get any more errors.
https://docs.aws.amazon.com/eks/latest/userguide/metrics-server.html |
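The verification command itself did not survive in the comment above. One way to check for the error, assuming that kubectl api-resources reports failed discovery groups on stderr with the same message quoted in this issue, is sketched below. The function name discovery_ok and the KUBECTL override are hypothetical.

```shell
# Hypothetical check: succeed only if API discovery reports no failed groups.
# KUBECTL can be overridden (for example with a stub) to try the function
# without a cluster.
discovery_ok() {
  # Keep stderr only; the resource table on stdout is not needed here.
  discovery_errors="$("${KUBECTL:-kubectl}" api-resources 2>&1 >/dev/null)" || true
  case "$discovery_errors" in
    *"unable to retrieve the complete list of server APIs"*)
      # Show the failing groups to the caller and signal failure.
      printf '%s\n' "$discovery_errors" >&2
      return 1
      ;;
  esac
  return 0
}

# Example:
#   discovery_ok && echo "discovery looks healthy"
```

The check matches on the message text rather than on the exit status of kubectl, since the exit status in the partial-failure case is not something this sketch relies on.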
This helped a lot. |
Similar to the fix in helm (helm/helm#6361), this fix allows GroupDiscoveryFailedError to not error out the process of managing apiserver resource types. https://github.com/gabe-l-hart/operator-sdk/issues/5596 Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
…work#5596) Similar to the fix in helm (helm/helm#6361), this fix allows GroupDiscoveryFailedError to not error out the process of managing apiserver resource types. Fixes operator-framework#5596 Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
There is a typo: |
Output of helm version:
version.BuildInfo{Version:"v3.0+unreleased", GitCommit:"180db556aaf45f34516f8ddb9ddac28d71736a3e", GitTreeState:"clean", GoVersion:"go1.13"}
Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T12:36:28Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3+IKS", GitCommit:"66a72e7aa8fd2dbf64af493f50f943d7f7067916", GitTreeState:"clean", BuildDate:"2019-08-23T08:07:38Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Cloud Provider/Platform (AKS, GKE, Minikube etc.):
IBM Cloud
Helm chart deployment fails with:
(The first error is in a confluent chart... here I discuss the second issue)
Looking at the error I see a similar problem with
Then, looking at 'action.go' in the source, I can see that if this API call fails, we exit getCapabilities(). I understand why, but is this failure too 'hard'? In the case above, the error came from a minor service.
This seems to have come up recently due to some changes on the k8s service with metrics.
I will pursue that separately, but I was after thoughts on how Helm handles this situation.
Also, a heads up: helm3 may be broken on IKS, but I'm not knowledgeable enough to dig much further.