EKS 1.24 metric-server 503 error #1431

Open
afilonchuk opened this issue Mar 1, 2024 · 5 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@afilonchuk

What happened: I get an error: controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable , Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]

Environment: EKS 1.24 + metrics-server:v0.7.0 + aws vpc-cni
Major:"1", Minor:"27", GitVersion:"v1.27.2", GitCommit:"7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647", GitTreeState:"clean", BuildDate:"2023-05-17T14:20:07Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"darwin/arm64"}

  • Kubernetes distribution (GKE, EKS, Kubeadm, the hard way, etc.):

  • Container Network Setup (flannel, calico, etc.):

  • Kubernetes version (use kubectl version):

  • Metrics Server manifest
    helm chart 3.12.0

  • Metrics server logs:

spoiler for Metrics Server logs:

I0301 11:50:48.516799 1 serving.go:374] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0301 11:50:49.134213 1 handler.go:275] Adding GroupVersion metrics.k8s.io v1beta1 to ResourceManager
I0301 11:50:49.265768 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0301 11:50:49.265807 1 shared_informer.go:311] Waiting for caches to sync for RequestHeaderAuthRequestController
I0301 11:50:49.265965 1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0301 11:50:49.266000 1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0301 11:50:49.266085 1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0301 11:50:49.266132 1 shared_informer.go:311] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0301 11:50:49.266457 1 secure_serving.go:213] Serving securely on [::]:4443
I0301 11:50:49.266567 1 dynamic_serving_content.go:132] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I0301 11:50:49.266936 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0301 11:50:49.365943 1 shared_informer.go:318] Caches are synced for RequestHeaderAuthRequestController
I0301 11:50:49.366054 1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0301 11:50:49.366219 1 shared_informer.go:318] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file

  • Status of Metrics API:

Name:         v1beta1.metrics.k8s.io
Namespace:
Labels:       app.kubernetes.io/instance=metrics-server
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=metrics-server
              app.kubernetes.io/version=0.7.0
              argocd.argoproj.io/instance=metrics-server
              helm.sh/chart=metrics-server-3.12.0
Annotations:  <none>
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2024-03-01T11:50:46Z
  Resource Version:    198063091
  UID:                 727329a3-9bb5-4e7c-9db5-111d17049c8d
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       metrics-server
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2024-03-01T11:51:16Z
    Message:               all checks passed
    Reason:                Passed
    Status:                True
    Type:                  Available
Events:                    <none>
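
For anyone who wants to script this check instead of reading the describe output by eye, here is a minimal Python sketch. It assumes the input is the JSON printed by `kubectl get apiservice v1beta1.metrics.k8s.io -o json`. The function name `apiservice_available` and the `SAMPLE` document are hypothetical; the sample is trimmed to the condition fields reported above and is not a full object from the cluster.

```python
import json
from typing import Tuple


def apiservice_available(apiservice_json: str) -> Tuple[bool, str]:
    """Return (available, message) taken from the APIService 'Available' condition."""
    obj = json.loads(apiservice_json)
    for cond in obj.get("status", {}).get("conditions", []):
        if cond.get("type") == "Available":
            # Condition status is the string "True"/"False", not a JSON boolean.
            return cond.get("status") == "True", cond.get("message", "")
    return False, "no Available condition reported"


# Hypothetical sample, trimmed to the fields shown in the status above.
SAMPLE = json.dumps({
    "apiVersion": "apiregistration.k8s.io/v1",
    "kind": "APIService",
    "metadata": {"name": "v1beta1.metrics.k8s.io"},
    "status": {"conditions": [{
        "type": "Available",
        "status": "True",
        "reason": "Passed",
        "message": "all checks passed",
    }]},
})

if __name__ == "__main__":
    print(apiservice_available(SAMPLE))
```

Note that this only reports what the APIService condition says. In this issue the condition is `True` while the OpenAPI fetch still returns 503, so a passing check here does not rule out the problem.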
Screenshot 2024-03-01 at 15 08 39
Screenshot 2024-03-01 at 15 09 17
Screenshot 2024-03-01 at 15 09 57

/kind bug

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 1, 2024
@dashpole

dashpole commented Mar 7, 2024

/assign @yangjunmyfm192085
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 7, 2024
@yangjunmyfm192085
Contributor

Hi @afilonchuk, are there any other error logs for metrics-server? The log information provided so far is not enough to analyze the cause.

It is also recommended to refer to
https://github.com/kubernetes-sigs/metrics-server/blob/master/KNOWN_ISSUES.md#unable-to-work-properly-in-amazon-eks
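
One way to pull only the warning and error entries out of a metrics-server log is sketched below in Python. It assumes the standard klog header layout (severity letter, `mmdd`, time, process id, `file:line]`, message). `KLOG_RE`, `parse_line`, and `problems` are hypothetical names, and the `E` line in `SAMPLE` is made up for illustration; it is not taken from this issue.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# klog header: Lmmdd hh:mm:ss.uuuuuu pid file:line] message
KLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<pid>\d+) (?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


class Entry(NamedTuple):
    severity: str  # one of I, W, E, F
    source: str    # "file.go:123"
    message: str


def parse_line(line: str) -> Optional[Entry]:
    """Parse one klog line; return None if it does not match the header layout."""
    m = KLOG_RE.match(line.strip())
    if m is None:
        return None
    return Entry(m["sev"], f'{m["file"]}:{m["line"]}', m["msg"])


def problems(lines: Iterable[str]) -> List[Entry]:
    """Keep only warning, error, and fatal entries."""
    found = []
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.severity in ("W", "E", "F"):
            found.append(entry)
    return found


SAMPLE = [
    # Real line from the log posted in this issue.
    "I0301 11:50:48.516799 1 serving.go:374] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)",
    # Hypothetical error line, for illustration only.
    "E0301 11:51:00.000000 1 example.go:1] hypothetical error message",
]

if __name__ == "__main__":
    for entry in problems(SAMPLE):
        print(entry.severity, entry.source, entry.message)
```

Run against the startup log posted above, `problems` would return an empty list, since every entry there is at severity `I`.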

@tg-sys

tg-sys commented Mar 15, 2024

@yangjunmyfm192085 Hello.

I just deployed the Helm chart from scratch. Let me share all the info I have at the moment again.

Screenshot 2024-03-15 at 16 10 20
Screenshot 2024-03-15 at 16 10 48
Screenshot 2024-03-15 at 16 13 45
Screenshot 2024-03-15 at 16 18 04

I've also tried creating an SA with an OIDC-based EKS role and adding it to the aws-auth ConfigMap... no luck so far (((

Moreover, I've upgraded the EKS cluster from 1.24 to 1.25, and that did not help either (((

@yangjunmyfm192085
Contributor

Hi, tg-sys, it seems that metrics-server is not getting data from the kubelet normally.
You may be able to check this in the following ways:

  1. Set the metrics-server log level to 6 and check whether metrics-server can scrape node data normally.
  2. Following
    https://github.com/kubernetes-sigs/metrics-server/blob/master/KNOWN_ISSUES.md#kubelet-doesnt-report-metrics-for-all-or-subset-of-nodes, access the kubelet's API endpoint directly to see whether the data can be scraped normally.
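
The second check can be sketched in Python as follows. This is an assumption-laden illustration, not the maintainers' procedure: the proxy path `/api/v1/nodes/<node>/proxy/metrics/resource` and the metric names in `REQUIRED` are what I understand the kubelet resource metrics endpoint to use, the function names are hypothetical, and `SAMPLE` is invented data in Prometheus text format.

```python
from typing import List, Set

# Node-level series that metrics-server needs, as I understand the endpoint (assumption).
REQUIRED = {"node_cpu_usage_seconds_total", "node_memory_working_set_bytes"}


def kubectl_raw_command(node_name: str) -> List[str]:
    """Build the kubectl invocation that reads a node's resource metrics via the API server proxy."""
    path = f"/api/v1/nodes/{node_name}/proxy/metrics/resource"
    return ["kubectl", "get", "--raw", path]


def metric_names(text: str) -> Set[str]:
    """Collect metric names from Prometheus text-format output, skipping comments."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The name ends at the first '{' (labels) or the first space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def missing_node_metrics(text: str) -> Set[str]:
    """Return the required node-level metrics that are absent from the output."""
    return REQUIRED - metric_names(text)


# Hypothetical output, for illustration only.
SAMPLE = """\
# TYPE node_cpu_usage_seconds_total counter
node_cpu_usage_seconds_total 123.4 1709293848000
# TYPE node_memory_working_set_bytes gauge
node_memory_working_set_bytes 1.0e+09 1709293848000
container_cpu_usage_seconds_total{container="app",namespace="default",pod="demo"} 1.5 1709293848000
"""

if __name__ == "__main__":
    print(" ".join(kubectl_raw_command("my-node")))  # "my-node" is a placeholder
    print(missing_node_metrics(SAMPLE))
```

To use it, run the printed command for each node, feed the output to `missing_node_metrics`, and look at any node for which the result is not empty.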

@tg-sys

tg-sys commented Mar 29, 2024

Screenshot 2024-03-29 at 17 48 10
@yangjunmyfm192085
I set the log level to 6, and it looks like the metrics were scraped...
metric_server_log_6.txt
