
How to update k8s server IP address #88648

Closed
ukreddy-erwin opened this issue Feb 28, 2020 · 20 comments
Labels
kind/support Categorizes issue or PR as a support question. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@ukreddy-erwin

The IP address of the k8s master needs to be updated because we moved to a different network. How can we update it instead of completely resetting the cluster with kubeadm? As there are many nodes, it is painful to join them all again.

@ukreddy-erwin ukreddy-erwin added the kind/support Categorizes issue or PR as a support question. label Feb 28, 2020
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Feb 28, 2020
@neolit123
Member

neolit123 commented Feb 28, 2020

kubernetes is not tolerant of "master/server IP" changes, as there are certificates at play which are aware of the IP.

there was a long discussion here kubernetes/kubeadm#338
where users have proposed workarounds and guides on how to change the IP, but take those with a grain of salt.

kubeadm does not have a way to change the IP for you and there are no plans to support this.

in terms of what the right thing to do is:
avoid using IPs in your cluster mappings and use DNS names instead:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#considerations-about-apiserver-advertise-address-and-controlplaneendpoint
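
for example, a minimal sketch of a kubeadm config that does this (the DNS name here is a placeholder you would keep pointed at your control plane):

    # kubeadm-config.yaml -- sketch; "cluster-endpoint.example.com" is a
    # placeholder DNS record you control and can re-point if the IPs change
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterConfiguration
    controlPlaneEndpoint: "cluster-endpoint.example.com:6443"

passed to kubeadm with kubeadm init --config kubeadm-config.yaml.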

/sig cluster-lifecycle network
/triage support
/close

@k8s-ci-robot
Contributor

@neolit123: Closing this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 28, 2020
@FerminCastro

This is a false closure. The fact that you COULD use a control-plane-endpoint HOSTNAME (not an IP) will NOT save you AT ALL from the control plane failing when the IPs of the master nodes change. The IPs of the nodes are hardcoded everywhere. This is a 30-year-old problem that Kubernetes should address: use hostnames, not IPs, in the config for the control and worker planes' network references... I am surprised this has not been addressed already. This basically precludes portability of a K8s system.

@neolit123
Member

neolit123 commented Mar 4, 2021

@FerminCastro

the kubeadm docs at least tell you that you should be careful with IPs and consider DNS names:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#considerations-about-apiserver-advertise-address-and-controlplaneendpoint

"The IPs of the nodes are hardcoded everywhere"

can you enumerate the locations where an IP of a control-plane is hardcoded?

@ukreddy-erwin
Author

ukreddy-erwin commented Mar 4, 2021 via email

@FerminCastro

The point is that even if you use a control-plane-endpoint such as a virtual front-end load balancer, the original IPs will still appear in a number of places. For example, kubeadm-config:

[opc@olk8-m1 ~]$ kubectl -n kube-system describe cm kubeadm-config | grep advertise
advertiseAddress: 10.10.0.23
advertiseAddress: 10.10.0.24
advertiseAddress: 10.10.0.25

Why can't K8s use hostnames for all of those?

@neolit123
Member

advertiseAddress: 10.10.0.23
advertiseAddress: 10.10.0.24
advertiseAddress: 10.10.0.25

these come from the kubeadm ClusterStatus, which is deprecated and no longer used.
the original idea there was to track endpoints when etcd is managed on the same node as an API server, but kubeadm (in 1.19, or thereabouts) switched to tracking etcd endpoint IPs in the etcd static pod instead.
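
a quick way to verify this on a kubeadm control-plane node (a sketch, assuming the default manifest path):

    # the etcd endpoints now live in the etcd static pod manifest
    sudo grep -E -- '--(initial-cluster|advertise-client-urls)=' /etc/kubernetes/manifests/etcd.yaml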

any other examples?

@neolit123
Member

"You can't expect to change the environment every time the IP changes. It should be redirected based on server name instead of IP."

not sure what you mean by "server name", but it has to be tracked somewhere

k8s components communicate with each other using kubeconfig files, which allow DNS names. one exception is etcd, since it needs a list of IPs passed to the kube-apiserver, but you could still shield an etcd cluster behind a load-balancer IP, which hopefully doesn't change.
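
for illustration, this is roughly what the cluster entry of a kubeconfig looks like with a DNS name (a sketch; the host name is a placeholder):

    # excerpt from a kubeconfig file
    clusters:
    - cluster:
        certificate-authority-data: <base64-encoded CA>
        server: https://control-plane.example.com:6443
      name: kubernetes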

to get wider attention to this IP-change problem, you can open a discussion on the SIG Architecture group mailing list:
https://github.com/kubernetes/community/tree/master/sig-architecture#contact

@FerminCastro

The etcd config itself, for example. We NEVER EVER use any IPs in any of our configuration steps. However, the default etcd.yaml has the node's IP everywhere. We had to change the listen-peer and listen-client URLs to use 0.0.0.0 (which will break in a number of situations with multi-NIC systems) because there is no way to use hostnames there either...
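
For reference, this is roughly what that workaround looks like in the etcd.yaml container command (a sketch of what we described above, with the multi-NIC caveat already mentioned):

    # bind etcd to all interfaces instead of a node IP
    - --listen-client-urls=https://0.0.0.0:2379
    - --listen-peer-urls=https://0.0.0.0:2380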

@neolit123
Member

neolit123 commented Mar 4, 2021

    - --advertise-client-urls=https://192.168.0.101:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://192.168.0.101:2380
    - --initial-cluster=luboitvbox=https://192.168.0.101:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://192.168.0.101:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://192.168.0.101:2380

is your request to be able to pass a DNS name instead of https://192.168.0.101?

but there is a core feature request here too, since the kube-apiserver flag:

    - --etcd-servers=https://127.0.0.1:2379

doesn't support DNS names.

@FerminCastro

And as many people have noted in other threads, most operations will fail with Unauthorized exceptions because the certs are invalid. For example, let me share a common op for us: moving an etcd snapshot from one location to another where we use THE EXACT same hostnames. Things should JUST work. Give it a try :-), and flannel breaks, coredns breaks, etcd breaks...

@FerminCastro

"is your request to be able to pass DNS name instead of https://192.168.0.101?"

We would like to avoid this:

@neolit123
Member

your best option is to add nodes from a new network and remove nodes from the old network, then swap only the LB endpoint and patch coredns / the CNI / whatever needs it.
changing the node IPs in place is not supported, and it's unlikely there will be tooling for such a migration any time soon.

this is not only a kubeadm problem, as the kubelet has --node-ip and the kube-apiserver has --etcd-servers, which are IP-only options that are commonly used too.

"We would like to avoid this:"

for kubeadm you could do this as a workaround:

  • sign custom certificates before calling kubeadm init / join (kubeadm would just accept them)
  • use --experimental-patches to add flags to etcd.yaml in the container command. flags with the same name will override previous flags with that name in the command line, e.g. --listen-peer-urls (a sketch follows this list).
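
a minimal sketch of the second workaround, assuming kubeadm 1.19+, where the patches directory holds files named target[suffix][+patchtype].extension (paths and values here are placeholders):

    # /etc/kubernetes/patches/etcd+json.json -- appends a duplicate flag,
    # which overrides the earlier flag of the same name in the command line
    [
      { "op": "add",
        "path": "/spec/containers/0/command/-",
        "value": "--listen-peer-urls=https://0.0.0.0:2380" }
    ]

    # then:
    sudo kubeadm init --experimental-patches /etc/kubernetes/patches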

but again, this is too complicated and unsupported to have a magical fix for.

@FerminCastro

"your best option is to add nodes from a new network and remove nodes from the old network. then only swap the LB endpoint. patch coredns / CNI whatever needs it."

Thanks, but there are many use cases where this is just not possible. We may be in a totally different system where we want to place the exact same K8s config we had in a test environment. Or move it to a different DC where we have aliased the hostnames properly. This is a problem that was solved for practically every type of IT system a long time ago... Do not attach your infrastructure and apps to any specific IPs. Great that the app layer in K8s allows that, but the control plane itself still does NOT.

@neolit123
Member

neolit123 commented Mar 4, 2021

like i've mentioned, you can complain to https://github.com/kubernetes/community/tree/master/sig-architecture#contact
the mailing list includes key Kubernetes maintainers.

for kubeadm we could adapt things like this to be IP or DNS name:
https://pkg.go.dev/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#InitConfiguration
https://pkg.go.dev/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#APIEndpoint

which would allow you to have a DNS name in the certs and in etcd.yaml, but the problem here is the kube-apiserver.
the same value is used for:

--bind-address ip     Default: 0.0.0.0
    The IP address on which to listen for the --secure-port port. The associated interface(s) must be reachable by the rest of the cluster, and by CLI/web clients. If blank or an unspecified address (0.0.0.0 or ::), all interfaces will be used.

https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/

there are a number of settings in k8s that are IP only.

@FerminCastro

Thanks, this is something that definitely needs to be revisited and fixed. We are finding ourselves having to move pretty complex control-plane configurations to other locations, and these configurations (control-plane things like spread topologies, labels, etc.) should all be totally portable (i.e. via their etcd snapshot), provided we keep hostnames consistent...

@FerminCastro

By the way, and sticking ONLY to kubeadm: maybe you can help us by describing how the coredns secret:

[opc@olk8-m1 ~]$ k get secret -A | grep coredns
kube-system coredns-token-h4l7g kubernetes.io/service-account-token 3 11m

is created by kubeadm. Whenever we restore an etcd snapshot on a different node (with the same hostname), we are forced to entirely redeploy coredns

sudo kubeadm init phase addon coredns

because the coredns pods keep failing with "0305 09:26:54.910414 1 reflector.go:153] pkg/mod/k8s.io/client-go@v0.17.2/tools/cache/reflector.go:105: Failed to list *v1.Service: Unauthorized"

messages. But if we delete the coredns deployment and the secret and recreate them with kubeadm init phase addon coredns, things work (unfortunately this is a pain, because we use a spread topology for coredns and need to apply it again after each restore). A sketch follows below.
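
For completeness, the workaround looks roughly like this (a sketch; the token secret name is cluster-specific):

    kubectl -n kube-system delete deployment coredns
    kubectl -n kube-system delete secret coredns-token-h4l7g   # name varies per cluster
    sudo kubeadm init phase addon coredns
    # then re-apply our coredns topology spread customizations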

Thanks for the help

@neolit123
Member

"is created by kubeadm. Whenever we restore an etcd snapshot on a different node (with the same hostname), we are forced to entirely redeploy coredns"

i have no explanation for this, but it's not a kubeadm quirk.
having to delete coredns and related objects after an etcd snapshot restore almost sounds like a service account got invalidated somehow.

maybe someone at #sig-api-machinery or #sig-auth at k8s slack knows why.
you can log a separate ticket for that here, if you think it's a core bug.

the kubeadm coredns related objects are in this file:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/phases/addons/dns/manifests.go
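
to inspect what kubeadm created there, something like this should work (a sketch; "system:coredns" is the name used for the ClusterRole/ClusterRoleBinding in that file):

    kubectl -n kube-system get serviceaccount coredns -o yaml
    kubectl get clusterrole system:coredns -o yaml
    kubectl get clusterrolebinding system:coredns -o yaml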

@FerminCastro

Not so sure that this is not a kubeadm issue, because the original coredns was created with kubeadm. Somehow the secret and ClusterRoleBinding get invalidated in the second location, so it seems like kubeadm is generating those with some inappropriate dependency. Again, it would help tremendously to find out how kubeadm generates the secret and ClusterRoleBinding for coredns.

@neolit123
Copy link
Member

neolit123 commented Mar 5, 2021 via email
