
Network setup: Investigate (from recent OpenWrt release notes)

  • RFC 7788 - Home Networking Control Protocol
  • RFC 7084 - Basic Requirements for IPv6 Customer Edge Routers

See also https://typhoon.psdn.io/bare-metal/ and https://github.com/kubermesh/kubermesh

Consider https://github.com/coreos/torcx for storing local “patches” to reapply to an otherwise-immutable base image.

Configure host sshd to accept certs signed by the k8s CA. Use short-lived signatures, just for individual debugging exercises (12h?).
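A minimal sketch of what that could look like (all paths and the principal are assumptions invented here; "core" is the flatcar default user, and the CA key needs to be in a form ssh-keygen can sign with):

# On each host: trust user certs signed by the cluster CA.
ssh-keygen -y -f /etc/kubernetes/pki/ca.key > /etc/ssh/k8s_ca.pub
echo 'TrustedUserCAKeys /etc/ssh/k8s_ca.pub' >> /etc/ssh/sshd_config
systemctl reload sshd

# Per debugging exercise: sign a user key, valid 12h only.
ssh-keygen -s /etc/kubernetes/pki/ca.key -I debug-$(whoami) \
  -n core -V +12h ~/.ssh/id_ed25519.pub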

TODO:

Install

On master

NB: local DNS preconfigured with `kube.lan` -> 192.168.0.9
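(For reference: on a dnsmasq-based resolver - e.g. the OpenWrt router implied up top; that placement is an assumption - this is a single config line:)

address=/kube.lan/192.168.0.9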

ip link add anycast0 type dummy || :
ip addr replace 192.168.0.9/32 dev anycast0

kubeadm init \
  --node-name=$(cat /etc/machine-id) \
  --pod-network-cidr=10.244.0.0/16 \
  --apiserver-cert-extra-sans=kube.lan,kube.oldmacdonald.farm \
  --apiserver-advertise-address=192.168.0.9 \
  --token-ttl=12h \
  --feature-gates=SelfHosting=true

Need to hack in another ssh session (fixed upstream maybe?):

sed -i 's/initialDelaySeconds: [0-9]\+/initialDelaySeconds: 180/' /etc/kubernetes/manifests/kube-apiserver.yaml

Go to k8s config section.

On nodes (containos):

Go back to kubeadm 1.12.x (yes, that's right - this was the last version before node join required reading the 'kube-proxy' configmap, via the node bootstrapper's ability to read all kube-system configmaps). See the sketch below for the would-be alternative.
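The fix on newer kubeadm would presumably be to grant the bootstrap-token group read access to that one configmap instead; a hedged sketch (role name invented here, and kubeadm's own RBAC objects may differ):

kubectl -n kube-system create role kube-proxy-reader \
  --verb=get --resource=configmaps --resource-name=kube-proxy
kubectl -n kube-system create rolebinding kube-proxy-reader \
  --role=kube-proxy-reader --group=system:bootstrappers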

#RELEASE="$(curl -sSL https://dl.k8s.io/release/stable.txt)"
RELEASE=v1.12.5
curl -L --remote-name-all https://dl.k8s.io/release/${RELEASE}/bin/linux/amd64/kubeadm

Get new token (with cluster-admin; this works with current kubeadm): `KUBECONFIG=$HOME/.kube/homeconf kubeadm token create --ttl 12h --print-join-command` (Note! kubeadm interprets key/cert relative paths incorrectly, so it needs to be run from ~/.kube/)

kubeadm join \
  --node-name=$(cat /etc/machine-id) \
  --discovery-token-ca-cert-hash=sha256:$certhash \
  --token $token kube.lan:6443 \
  --ignore-preflight-errors=all

bananapi fyi:

Can flash LEDs to identify a physical machine: `cat /sys/class/leds/bananapi:green:usr/trigger` shows the available values.

echo heartbeat > /sys/class/leds/bananapi:green:usr/trigger
echo none > /sys/class/leds/bananapi:green:usr/trigger

On nodes (flatcar):

PXE boot into flatcar (on ramdisk), then:

wget http://kube.lan:31069/pxe-config.ign
sudo flatcar-install -d /dev/sda -C beta -i pxe-config.ign
reboot

docker run --rm -it \
  -v /etc:/rootfs/etc \
  -v /opt:/rootfs/opt \
  -v /usr/bin:/rootfs/usr/bin \
  -e K8S_VERSION=v1.7.7 \
  xakra/kubeadm-installer coreos

PATH=$PATH:/opt/bin sudo kubeadm join \
  --node-name=$(cat /etc/machine-id) \
  --discovery-token-ca-cert-hash=sha256:$certhash \
  --token $token kube.lan:6443

K8s config

Edit daemonset/self-hosted-kube-apiserver to set `--etcd-quorum-read=true` (quorum=false should not exist, grumble, grumble)

scp root@kube.lan:/etc/kubernetes/admin.conf /tmp/kubeconfig
./push.sh

Self-hosted master-reboot recovery

(on master)

Will boot up, run etcd (from /etc/kubernetes/manifests), and then sit.

kubeadm alpha phase controlplane all \
  --pod-network-cidr=10.244.0.0/16 \
  --apiserver-advertise-address=192.168.0.9

Wait for the control plane to come up. It will read etcd and start the self-hosted control jobs; those will crash-loop because addresses/locks are in use. While this is happening:

rm /etc/kubernetes/manifests/kube-{apiserver,controller-manager,scheduler}.yaml

HA migration

Set up single (self-hosted) master using kubeadm as usual.

Join new node

Join new (potential master) node as normal:

kubeadm join \
  --node-name=$(cat /etc/machine-id) \
  --discovery-token-ca-cert-hash=sha256:$hash \
  --token $token kube.lan:6443

Promote to master role:

kubectl taint node $node node-role.kubernetes.io/master=:NoSchedule
kubectl label node $node node-role.kubernetes.io/master=

Secure/expose etcd

Set up a CA cert, signed server+peer certs for (at least) the existing and new etcd nodes, and client certs for the apiserver. NB: the existing (kubeadm) server will have etcd name “default”.
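One way to cut these (hedged: cfssl is an assumption - any CA tooling works - chosen because cfssljson -bare output matches the etcd-*-{server,peer}.pem names used below; the CSR jsons and a ca-config.json with server/peer profiles are presumed to exist):

cfssl gencert -initca etcd-ca-csr.json | cfssljson -bare etcd-ca
cfssl gencert -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem -config=ca-config.json \
  -profile=server -hostname=192.168.0.9,kmaster1 etcd-csr.json \
  | cfssljson -bare etcd-kmaster1-server
cfssl gencert -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem -config=ca-config.json \
  -profile=peer -hostname=192.168.0.9,kmaster1 etcd-csr.json \
  | cfssljson -bare etcd-kmaster1-peer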

On existing (kubeadm) master:

docker run --net=host --rm -e ETCDCTL_API=3 -ti \
  gcr.io/google_containers/etcd-arm:3.1.10 /bin/sh

etcdctl member list
etcdctl member update $memberID https://$ip:2380

Install certs and modify /etc/kubernetes/manifests/etcd.yaml to add:

env:
- name: POD_IP
  valueFrom:
    fieldRef:
      fieldPath: status.hostIP

command (additional flags):
- --advertise-client-urls=https://$(POD_IP):2379
- --listen-client-urls=http://127.0.0.1:2379,https://$(POD_IP):2379
- --cert-file=/keys/etcd-kmaster1-server.pem
- --key-file=/keys/etcd-kmaster1-server-key.pem
- --peer-cert-file=/keys/etcd-kmaster1-peer.pem
- --peer-key-file=/keys/etcd-kmaster1-peer-key.pem
- --peer-client-cert-auth
- --peer-trusted-ca-file=/keys/etcd-ca-peer.pem
- --listen-peer-urls=https://$(POD_IP):2380

volumeMounts:
- mountPath: /keys
  name: keys

volumes:
- hostPath:
    path: /etc/kubernetes/pki
    type: Directory
  name: keys

Run etcd on new node

Copy etcd TLS keys into /etc/kubernetes/pki

Copy manifests/etcd.yaml to the new node; modify ETCD_NAME and key paths. (It will crash-loop until the next step.)

On existing master:

docker run --net=host -e ETCDCTL_API=3 --rm -ti \
  gcr.io/google_containers/etcd-arm:3.1.10 \
  etcdctl member add kmaster2 --peer-urls=https://192.168.0.140:2380

On new (empty) additional master:

Copy /etc/kubernetes/pki/ca.key over to new machine(s)

ETCD_NAME=kmaster3
POD_IP=192.168.0.128
docker run --rm --net=host \
  -v /var/lib/etcd:/var/lib/etcd \
  -v /etc/kubernetes/pki:/keys \
  gcr.io/google_containers/etcd-arm:3.0.17 \
  etcd \
    --advertise-client-urls=https://${POD_IP}:2379 \
    --data-dir=/var/lib/etcd \
    --listen-client-urls=http://127.0.0.1:2379,https://${POD_IP}:2379 \
    --initial-cluster=default=https://192.168.0.9:2380,${ETCD_NAME}=https://${POD_IP}:2380 \
    --initial-advertise-peer-urls=https://${POD_IP}:2380 \
    --initial-cluster-state=existing \
    --cert-file=/keys/etcd-${ETCD_NAME}-server.pem \
    --key-file=/keys/etcd-${ETCD_NAME}-server-key.pem \
    --peer-cert-file=/keys/etcd-${ETCD_NAME}-peer.pem \
    --peer-key-file=/keys/etcd-${ETCD_NAME}-peer-key.pem \
    --peer-client-cert-auth \
    --peer-trusted-ca-file=/keys/etcd-ca.pem \
    --listen-peer-urls=https://${POD_IP}:2380 \
    --client-cert-auth \
    --trusted-ca-file=/keys/etcd-ca.pem \
    --election-timeout=10000 \
    --heartbeat-interval=1000

etcd care and feeding

New node

NB: remove old dead nodes before adding new nodes. See etcd FAQ for discussion.

Add to known members:

kubectl -n kube-system edit configmap etcd
kubectl -n kube-system exec $existing_etcd_pod -- \
  etcdctl member add $name --peer-urls=https://<pod ip>:2380

Replacement of failed node:

  • Careful when changing! StatefulSets don’t (yet) support updateStrategy.minAvailable, so 1x failed + 1x updating can lead to 2 down.
  • Delete the dead node; it isn’t coming back:
    kubectl delete node $node
  • Remove the dead member; it isn’t coming back:
    kubectl -n kube-system exec $existing_etcd_pod -- etcdctl member list
    kubectl -n kube-system exec $existing_etcd_pod -- etcdctl member remove $memberid
  • Promote the learner/spare:
    kubectl -n kube-system exec $existing_etcd_pod -- etcdctl member promote $learnerid
  • Update etcd k8s config to add a new etcd learner.
What I did:
  • Add to etcdMembers. Push. This updates certificate+secret, and nodeSelector. (good)

    … And it also changes the etcd command line, which leads to 2x etcd down 😱 Would a StatefulSet partition (of zero nodes) have helped here?

  • Copy the checkpointed manifest+secrets off the one remaining etcd, hack the podspec to match the mistakenly killed node, and use them to bring back 2x working replicas. Will need to repeat occasionally, until checkpoints converge.
  • etcdctl remove/add new member. etcd scheduled on new node, and (after etcdctl membership change) came up. This worked well.
  • statefulset continued the semantic-noop update of remaining pods with the new initial-cluster flag value.
Next time?
  • Delete the dead node first. It’s not coming back; let everything else failover/restart as expected early in the process.
  • Do the etcdctl membership remove/add first, while 2 nodes are up. Again, the dead node isn’t coming back.
  • Try the partition thing. Basically we want to update the cert + expand the nodeSelector, but under no circumstances restart an existing/healthy etcd peer.

Change hostnames to use symbolic etcd-[0-2] names? The apiserver still needs to know IPs - or a hostname? Initial cluster peers need IPs too, and they need to match current state to avoid an update. Another hostname?

Aha! Use a learner as a ‘warm spare’. Run replicas=4, but with one of the members a ‘learner’ according to etcd. After a failure, we can promote the learner to full member with an etcdctl command (no k8s changes required) and get back to 3 voting members. Then use regular k8s operations to replace the node and schedule a new learner (OK even if that replacement temporarily takes an etcd node down, since we’re now back to 3 full members). Sketch below.
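A hedged sketch of that warm-spare flow (learner support needs etcd >= 3.4, newer than the images used above; names are placeholders):

# Add the 4th member as a non-voting learner:
kubectl -n kube-system exec $existing_etcd_pod -- \
  etcdctl member add etcd-3 --learner --peer-urls=https://<pod ip>:2380

# After losing a voting member: promote the learner - back to 3 voters,
# with no k8s changes required:
kubectl -n kube-system exec $existing_etcd_pod -- \
  etcdctl member promote $learnerid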

Disaster Recovery

etcd
  1. Copy static manifests around from (hopefully) a remaining good node.
  2. Get etcd up. No point fixing anything else until this happens. (If no good member survives, see the snapshot-restore sketch below.)
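Hedged sketch for the nothing-survived case: rebuild the data dir from a snapshot (assumes a backup.db taken earlier with `etcdctl snapshot save`; names/IPs as in the HA section above):

ETCDCTL_API=3 etcdctl snapshot restore backup.db \
  --name ${ETCD_NAME} \
  --initial-cluster ${ETCD_NAME}=https://${POD_IP}:2380 \
  --initial-advertise-peer-urls https://${POD_IP}:2380 \
  --data-dir /var/lib/etcd-restored
# Point the static manifest's hostPath at /var/lib/etcd-restored and start etcd.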
apiserver

Regenerate static manifest: kubecfg show bootstrap.jsonnet -o json

d=/etc/kubernetes/checkpoint-secrets/kube-system/etcd-2/etcd-peer
ETCDCTL_API=3 etcdctl \
  --endpoints https://192.168.0.161:2379 \
  --cacert=$d/ca.crt --key=$d/tls.key --cert=$d/tls.crt \
  member list

Upgrade

kubeadm binaries available from https://dl.k8s.io/release/$release/bin/linux/$arch/kubeadm

NB: control jobs first, then kubelets. Also: be sure to regenerate/rotate keys as part of the upgrade - they have a 6-month expiry.
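A quick expiry check before/after an upgrade, assuming kubeadm’s default pki layout:

for c in /etc/kubernetes/pki/*.crt; do
  echo -n "$c: "
  openssl x509 -in "$c" -noout -enddate
done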

v1.9 upgrade:

stash kubeadm-arm-v1.9.10 locally in ipfs:
ipfs add https://dl.k8s.io/release/v1.9.10/bin/linux/arm/kubeadm
-> QmSdVUeRF5QkSDZAd4sNMoH7AYANpXa4J9ME3TMQu8tVgh

On a master: fetch the kubeadm binary to /var/lib, then:

./kubeadm-v1.9.10 upgrade apply --feature-gates SelfHosting=true v1.9.10

  • Upgrade etcd image to 3.1.11

v1.10 upgrade:

kubeadm-arm-v1.10.12: QmSboULs6WEs9Q2R1HV21HRAWmbUNRkS9cvJMnRuvU5xfz

v1.11 upgrade:

Cert renewal

Remove (backup) {apiserver,apiserver-kubelet-client,front-proxy-client}.{crt,key}
Regen (run make)
Copy out to the nodes:

scp {apiserver,apiserver-kubelet-client,front-proxy-client}.{crt,key} $node:/etc/kubernetes/pki/

Regular upgrade

note: CoreDNS replaces kube-dns as the default DNS provider

note: Clusters still using Heapster for autoscaling should be migrated over to metrics-server and the custom metrics API

note: kube-proxy IPVS (note graceful termination is in v1.13)

note: sysctl support is now considered beta

kubeadm-arm-v1.11.10: QmaZ5cnybq5jPdjGk4Anght9xSoch1ExFfm2JQu49NBDPw

  1. update jsonnet kube-system manifests
  2. ./push.sh
  3. yolo manual delete of kube-dns svc, bring up coredns svc, check, delete kube-dns deploy
  4. ./coreos-update-all.sh

cluster-admin cert expiry

Now that cert-manager is rotating cluster-admin certs, they can expire before I remember to grab an updated copy. Oops.

On an etcd node:

sudo etcdctl get /registry/secrets/kube-system/kubernetes-admin \
  --cacert=/etc/kubernetes/checkpoint-secrets/kube-system/etcd-0/etcd-monitor/ca.crt \
  --cert=/etc/kubernetes/checkpoint-secrets/kube-system/etcd-0/etcd-monitor/tls.crt \
  --key=/etc/kubernetes/checkpoint-secrets/kube-system/etcd-0/etcd-monitor/tls.key
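The value comes back protobuf-encoded; piping it through a decoder such as auger (github.com/jpbetz/auger - using it here is an assumption) gives readable YAML:

d=/etc/kubernetes/checkpoint-secrets/kube-system/etcd-0/etcd-monitor
sudo etcdctl get /registry/secrets/kube-system/kubernetes-admin \
  --cacert=$d/ca.crt --cert=$d/tls.crt --key=$d/tls.key \
  --print-value-only | auger decode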