
Dashboard complaints on startup: x509: failed to load system roots and no roots provided #1287

Closed
timstoop opened this issue Sep 27, 2016 · 27 comments


@timstoop

Issue details

I'm following the documentation at http://kubernetes.io/docs/user-guide/ui/, but it fails at the first step already. The container fails to start with the log entry:

Starting HTTP server on port 9090
Creating API server client for https://10.101.10.1:443
E0927 10:59:50.111556       1 config.go:267] Expected to load root CA config from /var/run/secrets/kubernetes.io/serviceaccount/ca.crt, but got err: open /var/run/secrets/kubernetes.io/serviceaccount/ca.crt: no such file or directory
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://10.101.10.1:443/version: x509: failed to load system roots and no roots provided

Nowhere on that page is it explained how to deal with this issue, and a Google search doesn't provide enlightenment. We have service accounts enabled and the pod has the default one attached. When I take a look at the service account with describe, I get the following:

Name:           default
Namespace:      kube-system
Labels:         <none>

Image pull secrets:     <none>

Mountable secrets:      default-token-6x2t1

Tokens:                 default-token-6x2t1

I have no idea how to continue from here. Which cert is dashboard looking for? What's the best way of getting that into the container? Also, is the documentation outdated or am I doing something weird, as the (pretty simple) recipe does not seem to work for me.
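Some background on the error itself: the Dashboard backend is a Go binary, and "x509: failed to load system roots and no roots provided" is what Go's TLS stack reports when the client ends up with neither system CA certificates nor an explicit CA pool. A rough sketch of the pool-building step is below; this is an illustration of the Go behavior, not the actual Dashboard source, and `loadRootCAs` is a made-up helper name:

```go
package main

import (
	"crypto/x509"
	"fmt"
	"os"
)

// loadRootCAs tries to build a CA pool from the serviceaccount ca.crt.
// If the file is missing or is not valid PEM, it returns nil, leaving
// the TLS client with no roots at all — on an image without system
// certificates this produces exactly the error quoted above.
func loadRootCAs(path string) *x509.CertPool {
	pemBytes, err := os.ReadFile(path)
	if err != nil {
		return nil // file missing: the "no such file or directory" case in the log
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(pemBytes) {
		return nil // file present but not a valid PEM certificate
	}
	return pool
}

func main() {
	pool := loadRootCAs("/var/run/secrets/kubernetes.io/serviceaccount/ca.crt")
	fmt.Println("have roots:", pool != nil)
}
```

So the cert the Dashboard is looking for is the cluster CA that the service account admission controller normally mounts at that path.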

Environment

We're running the containers on CoreOS on AWS. Currently running 1.3.6, planning to update to 1.4.0 soon.

Dashboard version: v1.4.0
Kubernetes version: v1.3.6
Operating system: CoreOS stable
Node.js version: Not sure, using the default gcr image
Go version: Same as Node.js version
Steps to reproduce

Follow the guide as described here: http://kubernetes.io/docs/user-guide/ui/

Observed result
Starting HTTP server on port 9090
Creating API server client for https://10.101.10.1:443
E0927 10:59:50.111556       1 config.go:267] Expected to load root CA config from /var/run/secrets/kubernetes.io/serviceaccount/ca.crt, but got err: open /var/run/secrets/kubernetes.io/serviceaccount/ca.crt: no such file or directory
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to 
a server that does not exist. Reason: Get https://10.101.10.1:443/version: x509: failed to load system roots and no roots provided

And then the pod stays in CrashLoopBackOff status.

Expected result

A working UI!

@batikanu
Contributor

Have you checked whether the --apiserver-host parameter in the kubernetes-dashboard.yaml file is configured correctly?

kubernetes-dashboard.yaml
...
args:
# Uncomment the following line to manually specify Kubernetes API server Host
# If not specified, Dashboard will attempt to auto discover the API server and connect
# to it. Uncomment only if the default does not work.
# - --apiserver-host=http://my-address:port
...

@floreks
Member

floreks commented Sep 29, 2016

It's better to connect to the API server through the kubernetes service if HTTPS is used, so I'd leave it commented out.

According to this

E0927 10:59:50.111556       1 config.go:267] Expected to load root CA config from /var/run/secrets/kubernetes.io/serviceaccount/ca.crt, but got err: open /var/run/secrets/kubernetes.io/serviceaccount/ca.crt: no such file or directory

It looks like either the dashboard is not picking up the service account or your cluster is not configured properly. If it's the former, try deleting the default secrets and then the dashboard pod. The secrets should be recreated automatically, and the dashboard should pick them up on restart.

The second issue is more complex. You need to provide correct CA/server certificates to the API server. If a service account certificate is not provided, the server certificate will be used.

@timstoop
Author

Thanks for the replies. The ServiceAccount may actually be wrong, as it seems to only generate a token, nothing x509-based. Do we need to set something on the apiserver side to have it add the certs to the ServiceAccount as well?

@timstoop
Author

timstoop commented Sep 29, 2016

Adding the --apiserver-host option did not solve the problem; the error stays the same, except that it now adds the DNS name of the apiserver (the IP address in my original report was correct as well, btw).

It seems to want a certificate, and I have no idea how I should provide that :/

@colemickens
Contributor

Can you please share the flags you're passing to kube-apiserver? I suspect it's not being configured with the root CA cert properly, which would lead to the cert not being dropped into the runtime directory.

@floreks
Member

floreks commented Sep 30, 2016

Cluster configuration

Keep in mind that this is my dev configuration. I'm also using certificate based authentication to connect to the cluster. You can enable more authentication/authorization plugins if you want. This is just my basic setup.

API Server

--bind-address=0.0.0.0 \
--etcd-servers=http://127.0.0.1:2379 \
--allow-privileged=true \
--service-cluster-ip-range=10.0.0.0/24 \
--secure-port=443 \
--advertise-address=192.168.0.101 \
--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota \
--tls-cert-file=/home/floreks/kubernetes/apiserver.crt \
--tls-private-key-file=/home/floreks/kubernetes/apiserver.key \
--client-ca-file=/home/floreks/kubernetes/ca.crt \
--service-account-key-file=/home/floreks/kubernetes/apiserver.key

Kubelet

--require-kubeconfig \
--kubeconfig=/home/floreks/.kube/config \
--allow-privileged=true \
--cluster-domain=cluster.local \
--hostname-override=floreks-ms-7916 \
--cluster-dns=10.0.0.10

Controller manager

--kubeconfig=/home/floreks/.kube/config \
--service-account-private-key-file=/home/floreks/kubernetes/apiserver.key \
--root-ca-file=/home/floreks/kubernetes/ca.crt

Proxy

--kubeconfig=/home/floreks/.kube/config \
--proxy-mode=iptables

Scheduler

--kubeconfig=/home/floreks/.kube/config

Kubeconfig

current-context: default-context
apiVersion: v1
clusters:
- cluster:
    certificate-authority: /home/floreks/kubernetes/ca.crt
    server: https://192.168.0.101
  name: default-cluster
contexts:
- context:
    cluster: default-cluster
    user: admin
  name: default-context
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate: /home/floreks/kubernetes/admin.crt
    client-key: /home/floreks/kubernetes/admin.key

Certificates configuration

I'm using a simple script of mine to generate the needed certs. The correct SAN address/hostname needs to be set in the openssl config file.

Config & script

floreks@floreks-MS-7916:~/kubernetes$ cat worker-openssl.cnf 
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
IP.1 = 192.168.0.101

floreks@floreks-MS-7916:~/kubernetes$ cat openssl.cnf 
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
IP.1 = 10.0.0.1
IP.2 = 192.168.0.101
floreks@floreks-MS-7916:~/kubernetes$ cat generate-certs.sh 
#!/bin/bash

# Generate CA
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -days 365 -out ca.crt -subj "/CN=kube-ca"

# Generate api server
openssl genrsa -out apiserver.key 2048
openssl req -new -key apiserver.key -out apiserver.csr -subj "/CN=kube-apiserver" -config openssl.cnf
openssl x509 -req -in apiserver.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out apiserver.crt -days 365 -extensions v3_req -extfile openssl.cnf

# Generate kubelet
openssl genrsa -out kubelet.key 2048
openssl req -new -key kubelet.key -out kubelet.csr -subj "/CN=kubelet" -config worker-openssl.cnf
openssl x509 -req -in kubelet.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out kubelet.crt -days 365 -extensions v3_req -extfile worker-openssl.cnf

# Generate admin
openssl genrsa -out admin.key 2048
openssl req -new -key admin.key -out admin.csr -subj "/CN=kube-admin"
openssl x509 -req -in admin.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out admin.crt -days 365

By installing my admin certificate in the browser, I can connect to the deployed dashboard. More about how to do this in kubernetes/kubernetes#31665.

(Screenshot from 2016-09-30 09:38:47)

Note: You may have to delete default secrets and dashboard pod in order for it to pick up service accounts. After that it should work.
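As a cross-check on the certificate setup above, the same CA/apiserver pair (with the SANs from openssl.cnf) can be generated and verified programmatically. The sketch below mirrors generate-certs.sh using Go's crypto/x509; it is an illustration, not part of any Kubernetes component:

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// buildAndVerify generates a CA plus an apiserver certificate carrying
// the SANs from openssl.cnf, then verifies the leaf against the CA for
// the kubernetes.default.svc name, the same check a Go client performs.
func buildAndVerify() error {
	// CA key and self-signed CA cert (the "Generate CA" step).
	caKey, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return err
	}
	caTmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "kube-ca"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(365 * 24 * time.Hour),
		IsCA:                  true,
		KeyUsage:              x509.KeyUsageCertSign,
		BasicConstraintsValid: true,
	}
	caDER, err := x509.CreateCertificate(rand.Reader, caTmpl, caTmpl, &caKey.PublicKey, caKey)
	if err != nil {
		return err
	}
	caCert, _ := x509.ParseCertificate(caDER)

	// API server cert signed by the CA, with the SANs from openssl.cnf.
	srvKey, _ := rsa.GenerateKey(rand.Reader, 2048)
	srvTmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: "kube-apiserver"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(365 * 24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature | x509.KeyUsageKeyEncipherment,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		DNSNames: []string{"kubernetes", "kubernetes.default",
			"kubernetes.default.svc", "kubernetes.default.svc.cluster.local"},
		IPAddresses: []net.IP{net.ParseIP("10.0.0.1"), net.ParseIP("192.168.0.101")},
	}
	srvDER, err := x509.CreateCertificate(rand.Reader, srvTmpl, caCert, &srvKey.PublicKey, caKey)
	if err != nil {
		return err
	}
	srvCert, _ := x509.ParseCertificate(srvDER)

	// Verify the chain and hostname, as the Dashboard client would.
	roots := x509.NewCertPool()
	roots.AddCert(caCert)
	_, err = srvCert.Verify(x509.VerifyOptions{Roots: roots, DNSName: "kubernetes.default.svc"})
	return err
}

func main() {
	fmt.Println("verify error:", buildAndVerify())
}
```

If the SAN list is wrong (for example missing kubernetes.default.svc), this same Verify call fails with a hostname mismatch, which is the other common cause of x509 errors in this setup.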

@timstoop
Author

timstoop commented Sep 30, 2016

Of course:

apiserver:

/usr/local/bin/apiserver \
--allow-privileged=true \
--bind-address=0.0.0.0 \
--secure-port=443 \
--etcd-servers=http://127.0.0.1:2379 \
--advertise-address=${COREOS_PRIVATE_IPV4} \
--service-cluster-ip-range=10.101.10.0/23 \
--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota \
--enable-swagger-ui=true \
--logtostderr=true \
--cloud-provider=aws \
--tls-cert-file=/etc/k8s/api-cert \
--tls-private-key-file=/etc/k8s/api-key \
--client-ca-file=/etc/k8s/ca-cert \
--service-account-key-file=/etc/k8s/api-key \
--token-auth-file=/etc/k8s/tokens

For good measure, here's controller as well:

/usr/local/bin/controller-manager \
--address=0.0.0.0 \
--logtostderr=true \
--master=${INSECURE_KUBERNETES_API_ENDPOINT} \
--service-account-private-key-file=/etc/k8s/api-key \
--root-ca-file=/etc/k8s/ca-cert \
--cloud-provider=aws 

And scheduler:

/usr/local/bin/scheduler \
--address=0.0.0.0 \
--logtostderr=true \
--master=${INSECURE_KUBERNETES_API_ENDPOINT}

Also, our kubeconfig:

apiVersion: v1
kind: Config
clusters: 
- cluster:
    certificate-authority: /etc/k8s/ca-cert
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubelet
  name: kubelet
current-context: kubelet
users:
- name: kubelet
  user:
    token: XXXX

@timstoop
Author

timstoop commented Oct 4, 2016

So the ServiceAccount now includes the ca.crt (not sure what I changed), but I still get the same message:

Starting HTTP server on port 9090
Creating API server client for https://kubernetes.default.svc:443
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://kubernetes.default.svc:443/version: x509: failed to load system roots and no roots provided

I'm a bit out of ideas where else to look for a solution :/

@timstoop
Author

timstoop commented Oct 4, 2016

I've started another instance to check the SSL. When I use the following:

openssl s_client -connect kubernetes.default.svc:443 -CAfile /run/secrets/kubernetes.io/serviceaccount/ca.crt

The output ends with Verify return code: 0 (ok), so it seems the certificate is correct and all. It's also using the default ServiceAccount, which is what the dashboard pod is using as well.

Any idea?

@bryk
Contributor

bryk commented Oct 5, 2016

@timstoop What about other addon containers? Do they run correctly? E.g., heapster.

@timstoop
Author

timstoop commented Oct 5, 2016

@bryk Kube-dns runs without a problem.

I've been looking into this issue by adding the dashboard to an alpine container (which allows me to debug) and strace-ing the process. I can see it check the standard CA certificate directories, but the only thing it reads from the serviceaccount data is:

stat("/var/run/secrets/kubernetes.io/serviceaccount/token", {st_mode=S_IFREG|0644, st_size=856, ...}) = 0

It's not even trying to open "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"... That explains the error, but it doesn't make sense to me. Can I force the dashboard to use the ca.crt that's part of the serviceaccount somehow?
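For contrast, the intended in-cluster wiring (roughly what client-go's rest.InClusterConfig does) reads both the token and the ca.crt from that directory and puts the CA into tls.Config.RootCAs; leaving RootCAs nil tells Go to fall back to system roots, which the strace above shows the image doesn't have. A stdlib-only sketch follows; `loadInCluster` is a hypothetical helper, not Dashboard code:

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"os"
	"path/filepath"
)

// loadInCluster sketches in-cluster client setup: read the serviceaccount
// token and CA from dir and build a TLS config with an explicit root pool.
func loadInCluster(dir string) (token string, cfg *tls.Config, err error) {
	tok, err := os.ReadFile(filepath.Join(dir, "token"))
	if err != nil {
		return "", nil, err
	}
	caPEM, err := os.ReadFile(filepath.Join(dir, "ca.crt"))
	if err != nil {
		return "", nil, err
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return "", nil, fmt.Errorf("ca.crt is not valid PEM")
	}
	// RootCAs is set explicitly: leaving it nil would mean "use system
	// roots", which do not exist in the minimal Dashboard image.
	return string(tok), &tls.Config{RootCAs: pool}, nil
}

func main() {
	_, _, err := loadInCluster("/var/run/secrets/kubernetes.io/serviceaccount")
	fmt.Println(err)
}
```

The strace output above suggests the v1.4.0 binary stops after stat-ing the token, i.e. it never reaches the ca.crt step of this flow.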

@timstoop
Author

timstoop commented Oct 5, 2016

For reference, inside the container I can do this without a problem:

/opt/dashboard # openssl s_client -CAfile /var/run/secrets/kubernetes.io/serviceaccount/ca.crt -connect kubernetes.default.svc:443
[...]
    Verify return code: 0 (ok)

@bryk
Contributor

bryk commented Oct 5, 2016

How about using --kubeconfig param of dashboard? And pass your local kubeconfig there?

@timstoop
Author

timstoop commented Oct 5, 2016

Alas:

/opt/dashboard # ./dashboard --kubeconfig=/etc/k8s/kubeconfig
unknown flag: --kubeconfig
Usage of ./dashboard:
      --alsologtostderr value          log to standard error as well as files
      --apiserver-host string          The address of the Kubernetes Apiserver to connect to in the format of protocol://address:port, e.g., http://localhost:8080. If not specified, the assumption is that the binary runs inside a Kubernetes cluster and local discovery is attempted.
      --heapster-host string           The address of the Heapster Apiserver to connect to in the format of protocol://address:port, e.g., http://localhost:8082. If not specified, the assumption is that the binary runs inside a Kubernetes cluster and service proxy will be used.
      --log-flush-frequency duration   Maximum number of seconds between log flushes (default 5s)
      --log_backtrace_at value         when logging hits line file:N, emit a stack trace (default :0)
      --log_dir value                  If non-empty, write log files in this directory
      --logtostderr value              log to standard error instead of files (default true)
      --port int                       The port to listen to for incoming HTTP requests (default 9090)
      --stderrthreshold value          logs at or above this threshold go to stderr (default 2)
  -v, --v value                        log level for V logs
      --vmodule value                  comma-separated list of pattern=N settings for file-filtered logging

@timstoop
Author

timstoop commented Oct 5, 2016

But this is kind of interesting:

/opt/dashboard # ./dashboard --apiserver-host https://kubernetes.default.svc:443 -v 999
Starting HTTP server on port 9090
Creating API server client for https://kubernetes.default.svc:443
I1005 08:08:10.428465      33 round_trippers.go:299] curl -k -v -XGET  -H "User-Agent: dashboard/v0.0.0 (linux/amd64) kubernetes/$Format" -H "Accept: application/json, */*" https://kubernetes.default.svc:443/version
I1005 08:08:10.438527      33 round_trippers.go:318] GET https://kubernetes.default.svc:443/version  in 10 milliseconds
I1005 08:08:10.438554      33 round_trippers.go:324] Response Headers:
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://kubernetes.default.svc:443/version: x509: failed to load system roots and no roots provided

It doesn't seem to be passing the token at all?

@timstoop
Author

timstoop commented Oct 5, 2016

Heh... So I installed curl inside the container, and now I have a different error:

/opt/dashboard # ./dashboard --apiserver-host https://kubernetes.default.svc:443 -v 999
Starting HTTP server on port 9090
Creating API server client for https://kubernetes.default.svc:443
I1005 08:11:04.628901      52 round_trippers.go:299] curl -k -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: dashboard/v0.0.0 (linux/amd64) kubernetes/$Format" https://kubernetes.default.svc:443/version
I1005 08:11:09.830962      52 round_trippers.go:318] GET https://kubernetes.default.svc:443/version  in 5201 milliseconds
I1005 08:11:09.830997      52 round_trippers.go:324] Response Headers:
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://kubernetes.default.svc:443/version: x509: certificate signed by unknown authority

Which makes sense, as it is still not using the serviceaccount's ca.crt.

@bryk
Contributor

bryk commented Oct 5, 2016

Ah, the kubeconfig flag was added after the 1.4 release. Can you check out the latest :canary tag or compile dashboard at HEAD?

@timstoop
Author

timstoop commented Oct 5, 2016

That worked! Thanks!

@timstoop timstoop closed this as completed Oct 5, 2016
@bryk
Contributor

bryk commented Oct 5, 2016

I'm still wondering why the default didn't work for you... I'll keep this open for further investigation.

@bryk bryk reopened this Oct 5, 2016
@timstoop
Author

timstoop commented Oct 5, 2016

Ok, I'll stay subscribed to this, so feel free to ask questions if you need answers! Happy to help.

@luohoufu

On my CentOS host I mounted /etc/pki, but I get another error: the server has asked for the client to provide credentials. How do I set the client cert & key files for the dashboard?
volumeMounts:
- name: "etcpki"
  mountPath: "/etc/pki"
  readOnly: true
- name: "config"
  mountPath: "/etc/kubernetes"
  readOnly: true
livenessProbe:
  httpGet:
    path: /
    port: 9090
  initialDelaySeconds: 30
  timeoutSeconds: 30
volumes:
- name: "etcpki"
  hostPath:
    path: "/etc/pki"
- name: "config"
  hostPath:
    path: "/etc/kubernetes"

I tried a few things, but without success.

$ kubectl delete pod kube-dns-v19-xg5or --namespace=kube-system
$ kubectl delete pod kubernetes-dashboard-v1.4.1-g1we7 --namespace=kube-system

When I use --kubeconfig, I get a panic.

@luohoufu

I will try v1.4.2.

@luohoufu

luohoufu commented Nov 12, 2016

I found the reason at last. I can use --kubeconfig. In my kubeconfig I had the cert files in another folder, so I had to add a volume mount to the dashboard. If your kubeconfig file uses client-certificate-data, which is base64-encoded, you will not need the mount.

@sgeisbacher

sgeisbacher commented Nov 28, 2016

@timstoop
I investigated the same problem: ca.crt is never read by the process.
Have you solved the problem by using a kubeconfig file?
If so, how did you put the dynamically generated /var/run/..../token in it, or did you create a new token?
Or did you create a keypair for your dashboard?

But I don't want a hardcoded token or a keypair for each service. I want to use the automatically generated token from /var/run/../token plus its sibling ca.crt.

@timstoop
Author

@sgeisbacher I switched to the canary release and it worked immediately.

@maciaszczykm
Member

Closing as stale. For clusters newer than 1.6, service accounts are required to run Dashboard.

@itkroplis

Can I use my own wildcard certificate (*.myname.com) from GoDaddy?
