
kubeadm alpha phase certs renew all should also update certs in KubeConfig files #1361

Closed · adoerler opened this issue Jan 25, 2019 · 43 comments

Labels: area/security, kind/bug, kind/documentation, lifecycle/active, priority/important-soon

@adoerler

FEATURE REQUEST

Versions

kubeadm version v1.12.5

Environment:

What happened?

3 of my clusters are now 1 year old. As some certs are issued with 1 year validity, the clusters stopped working properly. I had upgraded the clusters from 1.10.12 to 1.11.6 and 1.12.5 before the certificates reached their expiration date.

I've experienced several problems:

Even with Certificate Rotation enabled, kubelet.conf points to outdated certs

  • As Certificate Rotation has been enabled in one of the upgrades (not sure when), the pem file /var/lib/kubelet/pki/kubelet-client-current.pem was rotated correctly, but
    • on Nodes: client-certificate and client-key in /etc/kubernetes/kubelet.conf still pointed to /var/lib/kubelet/pki/kubelet-client.*
    • on Master: client-certificate-data and client-key-data in /etc/kubernetes/kubelet.conf still contained the certificate, which would expire soon.
    • I had to manually update client-certificate-data and client-key-data on all nodes and all clusters (see the sketch after this list)
      • Alternatively one could use sudo kubeadm alpha phase kubeconfig kubelet to regenerate this file on Master and all Nodes!
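For reference, a minimal sketch of what the user section of /etc/kubernetes/kubelet.conf looks like once it points at the rotated certificate instead of embedding the data (assuming a default kubeadm layout; the user name is whatever your existing kubelet.conf already uses):

# /etc/kubernetes/kubelet.conf (excerpt)
users:
- name: default-auth
  user:
    # reference the symlink that the kubelet keeps rotating,
    # instead of embedded client-certificate-data/client-key-data:
    client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
    client-key: /var/lib/kubelet/pki/kubelet-client-current.pem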

Certificate Rotation does not update apiserver/etcd/front-proxy-client certs

  • Certificate Rotation does not seem to update any of the other certificates on Master, i.e.
    • apiserver*
    • etcd*
    • front-proxy-client

The command kubeadm alpha phase certs renew all does not update KubeConfig files

  • I've manually issued sudo kubeadm alpha phase certs renew all on master, which renews all expired certs in /etc/kubernetes/pki, which is fine, BUT
    • KubeConfig files like the following are not updated:
      • /etc/kubernetes/admin.conf
      • /etc/kubernetes/controller-manager.conf
      • /etc/kubernetes/scheduler.conf
  • Therefore the static pods are still using the old certificate, so I had to use sudo kubeadm alpha phase kubeconfig all --apiserver-advertise-address=x.x.x.x
    • Additionally one has to restart the static pods (or, easier, the master server) to re-read the new certificates.
    • It gets even worse if certificates have already expired. In this case you can kubectl -n kube-system delete pod kube-apiserver-master, which seems to work, but in reality the pod never gets restarted - I had to stop and start the container with docker stop/start. (A quick check for expired certs embedded in KubeConfig files is sketched after this list.)
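A quick way to see whether the client certificate embedded in one of these KubeConfig files has already expired (a sketch, assuming a standard kubeadm layout and that openssl is available; adjust the file name as needed):

grep 'client-certificate-data' /etc/kubernetes/admin.conf | awk '{print $2}' | base64 -d | openssl x509 -noout -subject -enddate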

What you expected to happen?

  • I think there is not much one could do about the first issue; if the config file is wrong, how should the cluster inform an admin...
  • Certificate rotation is responsible for the kubelet, so there is also not much one could do about the second issue
  • For certs renew I would suggest updating the documentation (https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/) and stating when to run this command (once a year). At first sight it is not clear whether this command has to be executed on master and all nodes or just on master.
  • I'd also suggest that the command either updates the KubeConfig files too or at least gives the user a hint to do it manually. It should also suggest restarting the static pods after updating the KubeConfig files.
  • kubeadm alpha phase kubeconfig should either restart the static pods after the config has been written or inform the user to do so.

Best regards
Andreas

@neolit123 neolit123 added the help wanted and area/security labels Jan 25, 2019
@neolit123
Member

@MalloZup
of course, but please note that the join phases are a high priority.

@neolit123
Member

sounds good! thanks a lot.

@adoerler
Author

Hi,

there is one more thing regarding this topic.

kubeadm alpha phase kubeconfig all shows these messages if the conf files are already in place when issuing the command:

[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/scheduler.conf"

It does not check whether the certs are expired, so in my opinion "up-to-date" is misleading.

To get the updated certs into the files one MUST remove the files upfront; then the log looks like:

[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"   

In my case I thought I was fine, but a few days later the static pods couldn't communicate due to outdated certificates.

Best Regards
Andreas

@timothysc timothysc added the kind/bug and priority/important-soon labels Jan 28, 2019
@timothysc timothysc added this to the v1.14 milestone Jan 28, 2019
@fabriziopandini fabriziopandini added the lifecycle/active label Feb 6, 2019
@fabriziopandini
Member

Assigned to @MalloZup

@k8s-ci-robot
Contributor

@MalloZup: GitHub didn't allow me to assign the following users: MalloZup.

Note that only kubernetes members and repo collaborators can be assigned and that issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@MalloZup

MalloZup commented Feb 6, 2019

hi @adoerler, thanks for the issue. Regarding the misleading info I have sent a PR: kubernetes/kubernetes#73798.

I will have a look at the rest of the issue once I have time. Thanks for the time and precision of the issue.

@MalloZup

MalloZup commented Feb 11, 2019

@adoerler I have sent a docs PR for your suggestion. Feel free to have a look, TIA 🚀
(kubernetes/website#12579)

@adoerler
Author

Hi @MalloZup,

thanks for PR!

I'm missing a sentence about the kubeconfig files, because certs renew is only one part of the game.
Something like:

After certs have been renewed, don't forget to recreate the KubeConfig files using kubeadm alpha phase kubeconfig ...

@MalloZup

Thanks. I didn't add the doc because I was thinking that we could actually renew the kubeconfig files as well. The rest (restarting the pods) we can delegate to the user and document minimally. @fabriziopandini @lubomir @ereslibre am I missing something about this implementation? TIA

@fabriziopandini
Member

@MalloZup I don't have deep knowledge of how certs renewal works.

Personally, I would like to clarify the overall story a little before taking action - including what is proposed above:

  • what should be managed by kubeadm alpha phase certs renew
  • what should be managed automatically during kubeadm upgrade
  • what should be documented (and managed by the users)
  • how this applies to HA clusters
  • how this is impacted by cluster variants (e.g. external etcd, external CA)
  • etc.

but I leave the final word to people more skilled than me in this area

@neolit123
Member

i think we should reserve time in a meeting to discuss what our recommended certs renewal policy should be. the page about certs management might need some extra detail:
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs

and we need to write a small guide, at least for single control-plane clusters as a start.

what users have been doing is figuring things out on their own:
#581 (comment)
^ this comment and the one above contain user-made guides.

this is a sign that we need to add an official guide.
cc @timothysc @liztio

@timothysc
Member

/assign @ereslibre

@dimm0

dimm0 commented Feb 15, 2019

Our cluster with a couple hundred users is stuck at the moment. Could I have a very quick guide on what to do with an expired cert?

@neolit123
Member

@dimm0

what users have been doing is figuring things out on their own:
#581 (comment)
^ this comment and the one above contain user-made guides.

these are the only guides we have ATM.

@dimm0

dimm0 commented Feb 15, 2019

[root@controller0 ~]# kubeadm alpha phase certs apiserver --apiserver-advertise-address 1.2.3.4
Error: unknown flag: --apiserver-advertise-address
Usage:

Flags:
  -h, --help   help for phase

Global Flags:
      --log-file string   If non-empty, use this log file
      --rootfs string     [EXPERIMENTAL] The path to the 'real' host root filesystem.
      --skip-headers      If true, avoid header prefixes in the log messages
  -v, --v Level           log level for V logs

error: unknown flag: --apiserver-advertise-address
[root@controller0 ~]# kubeadm alpha phase certs apiserver
This command is not meant to be run on its own. See list of available subcommands.

@neolit123
Member

neolit123 commented Feb 15, 2019

in 1.13 the init phases have graduated to the parent init command:
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init-phase/#cmd-phase-certs

in 1.12 the flag should be there:
https://v1-12.docs.kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-alpha/#cmd-phase-certs

1.11 is soon going out of support.
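For 1.13 and newer the equivalent would look roughly like this (a sketch, assuming a default kubeadm setup; check kubeadm init phase certs --help on your version for the exact flags):

kubeadm init phase certs apiserver --apiserver-advertise-address 1.2.3.4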

@neolit123 neolit123 added the kind/documentation label and removed the lifecycle/active label Mar 7, 2019
@neolit123
Member

removing the lifecycle/active label.
moving to 1.15.

possible docs update ideas here:
#1361 (comment)

@neolit123 neolit123 modified the milestones: v1.14, v1.15 Mar 7, 2019
@tushar00jain

@neolit123 @fabriziopandini
are the steps you mentioned also for rotating the CA cert? Can this be documented as well? What about rotating the private keys including the one for the CA?

@fabriziopandini
Member

fabriziopandini commented May 3, 2019

@tushar00jain rotation of the CA cert is tracked in another issue: #1350
This issue focuses on signed certs only

@neolit123
Member

@fabriziopandini i was looking at closing this ticket today as you were able to send PRs for the renewal parts. should the ticket be closed?

Even with Certificate Rotation enabled, kubelet.conf points to outdated certs (already tracked by #1317)

yes this is tracked in a separate issue, possibly needs discussion/docs in terms of what workarounds we should provide.

Certificate Rotation does not update apiserver/etcd/front-proxy-client certs (fixed by kubernetes/kubernetes#76862)

The Command kubeadm alpha phase certs renew all does not update KubeConfig files (fixed by kubernetes/kubernetes#77180)

Documentation about certs renewal (with more detail about where the command should be run, when, kubeconfig, HA)

the 3 above should be done.

@neolit123 neolit123 removed the help wanted label Jun 10, 2019
@neolit123 neolit123 modified the milestones: v1.15, v1.16 Jun 11, 2019
@fabriziopandini
Member

/close
As per comment above most of the work is already completed; the missing bit is tracked in a separate/dedicated issue

@k8s-ci-robot
Contributor

@fabriziopandini: Closing this issue.

In response to this:

/close
As per comment above most of the work is already completed; the missing bit is tracked in a separate/dedicated issue

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@AndrewSav

AndrewSav commented Oct 30, 2019

Can someone please explain to me how the "Even with Certificate Rotation enabled, kubelet.conf points to outdated certs" part was addressed? The only linked issue that mentions this explicitly was closed in favour of another issue, which was closed with "I'm not sure if this is an issue, so open a new ticket if it is".
I'm on 1.16 and do not see any renewal happening for kubelet.conf with sudo kubeadm alpha certs renew all. What am I missing? @neolit123

@fabriziopandini
Member

fabriziopandini commented Oct 30, 2019

a quick recap of a very, very long discussion.

  1. certificate rotation for all the certs except kubelet.conf is now managed by kubeadm alpha certs renew.
  2. certificate rotation for kubelet.conf will be managed by the kubelet itself (unless the user opts out of automatic certificate rotation)

As of today this second point already works for all the nodes except the one where you run kubeadm init; kubernetes/kubernetes#84118 is going to fix that
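For point 2, whether the kubelet rotates its client certificate is controlled by the kubelet configuration; a minimal sketch of the relevant setting, assuming the default kubeadm file layout:

# /var/lib/kubelet/config.yaml (excerpt)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# kubeadm enables client certificate rotation by default;
# setting this to false is the opt-out mentioned above:
rotateCertificates: true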

@AndrewSav

@fabriziopandini Thank you for this, it makes sense.

For anyone else facing the issue of the certs in kubelet.conf being out of date between now and when the above is fixed, I found this article helpful:

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#check-certificate-expiration

On nodes created with kubeadm init, prior to kubeadm version 1.17, there is a bug where you manually have to modify the contents of kubelet.conf. After kubeadm init finishes, you should update kubelet.conf to point to the rotated kubelet client certificates, by replacing client-certificate-data and client-key-data with:

client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
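To confirm that the rotated file actually contains a fresh certificate, something along these lines can be used (a sketch, assuming openssl is available on the node):

sudo openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -subject -enddate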

@tannh

tannh commented Nov 18, 2019

@AndrewSav Thank you for this. I have used the Prometheus operator to monitor the cluster. I recently received an alert "Kubernetes API certificate is expiring in less than 7 days"; I think it is related to this issue. I have updated the content of kubelet.conf on the master nodes, but I still get the alert. Do you have any suggestions? Thanks.

@AndrewSav

@tannh if you installed the cluster with kubeadm, use kubeadm to check the certs expiration. Otherwise your issue is probably not related.

@neolit123
Member

On nodes created with kubeadm init, prior to kubeadm version 1.17, there is a bug where you manually have to modify the contents of kubelet.conf. After kubeadm init finishes, you should update kubelet.conf to point to the rotated kubelet client certificates, by replacing client-certificate-data and client-key-data with:

this will also be in the release notes for 1.17.

@SuleimanWA

SuleimanWA commented Feb 27, 2020

@adoerler I am still running an old version of kubeadm; how can I update kubelet.conf, admin.conf, etc. after certificate renewal?

I ran "kubeadm alpha certs renew all", which generated new certificates. Now I need to edit all the .conf files under /etc/kubernetes - how, and where exactly should they point?
And in the case of multiple master nodes, should I run the command on all masters?

@adoerler
Author

Hi @SuleimanWA ,

I cannot tell you what to do on a multi master env, I've had only single master in my setup.

This is what I've done:

First of all, make sure to move the existing conf files out of the way, because existing files will not get overwritten!

mv /etc/kubernetes/admin.conf /backup
mv /etc/kubernetes/kubelet.conf /backup
mv /etc/kubernetes/controller-manager.conf /backup
mv /etc/kubernetes/scheduler.conf /backup

then update these files:

user@master:~$ sudo kubeadm alpha phase kubeconfig all --apiserver-advertise-address=<INSERT-YOUR-APISERVER-IP-HERE>
I0124 21:56:14.253641   15040 version.go:236] remote version is much newer: v1.13.2; falling back to: stable-1.12
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"    

To apply the new certificates in the static system pods, the easiest way for me was to simply reboot the master server.
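If a full reboot is not an option, restarting the control-plane containers also works; a minimal sketch, assuming a Docker-based runtime (container names will differ on other runtimes):

for c in kube-apiserver kube-controller-manager kube-scheduler etcd; do
  docker ps --filter "name=k8s_${c}" --format '{{.ID}}' | xargs -r docker restart
done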

Don't forget to copy client-certificate-data and client-key-data from /etc/kubernetes/admin.conf to your local .kube/config.
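A quick way to pull those two fields out of the renewed admin.conf so they can be pasted into the local kubeconfig (a sketch; if the local file is otherwise identical to admin.conf, copying the whole file also works):

sudo grep -E 'client-certificate-data|client-key-data' /etc/kubernetes/admin.conf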

Hope this helps

Andreas

@provgregoryabdo

provgregoryabdo commented Mar 2, 2020

Any idea how to run this command on 1.14.10? All I get is:

kubeadm alpha phase kubeconfig all --apiserver-advertise-address=192.168.102.170
Error: unknown flag: --apiserver-advertise-address

Then the docs say:
kubeadm alpha phase kubeconfig all
and I get:
This command is not meant to be run on its own. See list of available subcommands.

Thanks

@adoerler
Author

adoerler commented Mar 2, 2020

Hi @provgregoryabdo,

What's your kubeadm version output?

BR Andreas

@sheer-lore

@provgregoryabdo the phase commands moved out of alpha and into init in later versions, so you can use something like

kubeadm init phase kubeconfig all --apiserver-advertise-address=<your_address>

@adoerler thanks for the help!

@Antebios

I know this thread is old, but just in case anyone comes looking at this, I have an updated answer that saved me. I used "kubeadm certs renew all" to bring my system back to life.

  1. Check your certificates with: sudo kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Oct 17, 2021 05:46 UTC   <invalid>                               no
apiserver                  Oct 17, 2021 05:46 UTC   <invalid>       ca                      no
apiserver-etcd-client      Oct 17, 2021 05:46 UTC   <invalid>       etcd-ca                 no
apiserver-kubelet-client   Oct 17, 2021 05:46 UTC   <invalid>       ca                      no
controller-manager.conf    Oct 17, 2021 05:46 UTC   <invalid>                               no
etcd-healthcheck-client    Oct 17, 2021 05:46 UTC   <invalid>       etcd-ca                 no
etcd-peer                  Oct 17, 2021 05:46 UTC   <invalid>       etcd-ca                 no
etcd-server                Oct 17, 2021 05:46 UTC   <invalid>       etcd-ca                 no
front-proxy-client         Oct 17, 2021 05:46 UTC   <invalid>       front-proxy-ca          no
scheduler.conf             Oct 17, 2021 05:46 UTC   <invalid>                               no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Mar 24, 2030 16:43 UTC   8y              no
etcd-ca                 Mar 24, 2030 16:43 UTC   8y              no
front-proxy-ca          Mar 24, 2030 16:43 UTC   8y              no

  2. All of my certificates were expired, so I renewed them with sudo kubeadm certs renew all
[renew] Reading configuration from the cluster...
[renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[renew] Error reading configuration from the Cluster. Falling back to default configuration

certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
certificate the apiserver uses to access etcd renewed
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healthcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed

Done renewing certificates. You must restart the kube-apiserver, kube-controller-manager, kube-scheduler and etcd, so that they can use the new certificates.
  3. Check the certificates again to see their renewal status:
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Oct 29, 2022 01:48 UTC   364d                                    no
apiserver                  Oct 29, 2022 01:48 UTC   364d            ca                      no
apiserver-etcd-client      Oct 29, 2022 01:48 UTC   364d            etcd-ca                 no
apiserver-kubelet-client   Oct 29, 2022 01:48 UTC   364d            ca                      no
controller-manager.conf    Oct 29, 2022 01:48 UTC   364d                                    no
etcd-healthcheck-client    Oct 29, 2022 01:48 UTC   364d            etcd-ca                 no
etcd-peer                  Oct 29, 2022 01:48 UTC   364d            etcd-ca                 no
etcd-server                Oct 29, 2022 01:48 UTC   364d            etcd-ca                 no
front-proxy-client         Oct 29, 2022 01:48 UTC   364d            front-proxy-ca          no
scheduler.conf             Oct 29, 2022 01:48 UTC   364d                                    no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Mar 24, 2030 16:43 UTC   8y              no
etcd-ca                 Mar 24, 2030 16:43 UTC   8y              no
front-proxy-ca          Mar 24, 2030 16:43 UTC   8y              no
  4. Reboot (or restart the control-plane static pods without a full reboot; see the sketch after this list).
  5. See the master and the worker nodes come back alive and swear to the old gods and the new that I will do my backups.
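If rebooting the node is not desirable, the kubelet recreates the control-plane static pods when their manifests change; a minimal sketch of that trick, assuming the default manifest directory (give the kubelet a moment between the two moves):

sudo mv /etc/kubernetes/manifests /etc/kubernetes/manifests.off
sleep 20
sudo mv /etc/kubernetes/manifests.off /etc/kubernetes/manifests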

@lethargosapatheia

@Antebios This is about old versions that don't support this feature, because it's alpha. So if you really want to help, it would be nice to offer a solution for older versions (such as 1.19) where renew doesn't work at all.
