FAF K8S config

This repository implements the FAF architecture in Kubernetes. It’s intended to supersede the current FAF Docker-Compose stack.

Requirements & tools

  • k3s or k3d installed

    • The main node must be started with the following argument:
      --node-label=storage-id=main-01 (this determines the main storage node; see the sketch after this list)

  • jq installed (required for scripts)

  • Recommended: a Kubernetes UI such as Lens (GUI) or k9s (CLI)
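
A minimal sketch of how the node label could be set, assuming a bare-metal k3s installation that reads its flags from /etc/rancher/k3s/config.yaml (the path and key names are worth double-checking against the k3s docs for the version in use):

```yaml
# Sketch: /etc/rancher/k3s/config.yaml - k3s picks up its CLI flags from this file.
# The label marks this node as the main storage node referenced by the persistent volumes.
node-label:
  - "storage-id=main-01"
```

For a local k3d cluster the same argument can usually be forwarded at cluster creation, e.g. k3d cluster create faf --k3s-arg "--node-label=storage-id=main-01@server:0" (the exact flag syntax may differ between k3d versions).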

Motivation

Immediate goals

  • Allow more people access to logs, configuration and deployments for certain services without giving them full server access.

    • Role-based access control

    • Direct cluster access via kubectl for all authorized developers

    • Improved configuration and secret management

  • Advanced resource controls thanks to CPU and RAM limits (no app shall consume all CPU ever again)

  • Easier debugging on the test environment via port-forwarding of pods, without compromising the production config

Long-term goals

  • Better integration into CI pipelines with automated deployments

Non-Goals

  • We do not move to k8s because it is cool and fancy!

    • K8s is much more complex than our docker-compose stack.

    • We would avoid it if docker-compose could solve our problems (which it can’t).

    • It was a very conscious trade-off decision.

  • We do not move to k8s because we want to deploy FAF on a managed cloud provider.

    • Cloud providers are super expensive. We’d have nothing to gain here.

  • We do not move to k8s to become highly available!

    • High availability only works if all components are highly available. Most of our apps are not built in that way at all.

    • Deployments with less downtime might be a benefit for some services.

Decision Log

Distribution selection

  • We’ll use k3s.

    • It is fully supported by NixOS and is a simplified distribution which should be easier to maintain.

    • It also runs on developer machines.

    • It uses few resources.

  • Running the same distribution on prod and on local machines makes things more predictable and scripts more stable.

    • Minikube should be mostly compatible if some devs insist on using it.

Volume management

  • We’ll go with manually managed persistent volumes and claims, because we need predictable paths.

  • Predictable paths are a necessity for managing the volumes with ZFS.

  • Using the k3s local-path-provisioner we can define the prefix (in the configmap local-path-config) and the suffix (in the mount options of the pod), but in between there is a random UUID we can’t know beforehand.
    This breaks predefined setups and scripts.

  • A K8s built-in local volume with node affinity ensures all data of a volume is stored on a selected node (the node labelled storage-id=main-01); see the sketch below.
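
A minimal sketch of such a manually managed volume (name, size and path are placeholders, not the actual FAF values):

```yaml
# Sketch: a manually managed PersistentVolume pinned to the main storage node.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-data                # placeholder name
spec:
  capacity:
    storage: 10Gi                   # placeholder size
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage   # placeholder storage class
  local:
    path: /data/example             # predictable path, manageable as a ZFS dataset
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: storage-id
              operator: In
              values:
                - main-01           # matches the node label from the requirements section
```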

Traefik IngressRoute over default Ingress definitions

  • K3s comes with Traefik as Ingress controller by default.

  • The de-facto standard Ingress controller in the wider K8s world is nginx.

  • Traefik is well known to FAF, since we already use it extensively as reverse proxy in our faf-stack.

  • Traefik offers support for

    • classic Ingress definitions, which however require Ingress annotations to use the more advanced features (similar to the Traefik labels in our current docker-compose.yml)

    • custom IngressRoute definitions, which map the exact Traefik feature set into a YAML format (no annotations required)

We have to select which resource type we use, and we should stick to it consistently. As always, it’s a trade-off:

  • Pro classic Ingress

    • Classic Ingress has been stable for a (not so long) while now, while Traefik IngressRoutes are still marked as alpha (yet we have been using Traefik for quite a while and there were rarely changes, even from 1.x to 2.x).

    • Classic Ingress is a well-known syntax understood by most external K8s users, so the entry barrier for external contributions is lower. However, a lot of functionality would hide behind Traefik annotations, which people would still need to learn in order to understand it all.

    • Using classic Ingress would allow us to swap out Traefik at any time and still have a mostly working setup.

  • Pro Traefik IngressRoute

    • We (the Ops people responsible for FAF) consider Traefik superior to nginx (and moved from nginx to Traefik as reverse proxy years ago).

      • Thus we do not expect to move back.

    • We have an existing stack we need to migrate 1:1

    • Since we use Traefik features anyway, using IngressRoute reduces the overall YAML complexity, as we do not split logic between resources and annotations.

    • Traefik syntax seems easier to understand than regular Ingress, so using Traefik syntax might lower the barrier for external contributors who have never used classic Ingress.

Decision: We’ll use Traefik IngressRoutes.
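
For illustration, a minimal IngressRoute could look like the following sketch (host, service and entrypoint names are placeholders; the apiVersion depends on the Traefik release, traefik.containo.us/v1alpha1 being the one used by Traefik 2.x):

```yaml
# Sketch: a Traefik IngressRoute replacing a classic Ingress plus annotations.
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: example-app
spec:
  entryPoints:
    - websecure                        # placeholder entrypoint
  routes:
    - match: Host(`app.example.com`)   # placeholder host rule
      kind: Rule
      services:
        - name: example-app            # placeholder k8s service
          port: 80
  tls:
    certResolver: letsencrypt          # ties in with the certificate decision below
```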

Certificate management & Let’s Encrypt

  • We could go with Traefik’s built-in certificate resolvers or use cert-manager.

  • cert-manager works with both classic Ingress definitions and Traefik-specific IngressRoutes

    • Needs additional software

    • Has a short support cycle (6 months per point release)

    • ⇒ More maintenance overhead

  • Traefik’s internal Let’s Encrypt resolver needs to be configured manually on the node

    • It stores certificates somewhere on disk

    • The easiest approach is a persistent volume on the main storage node

      • This effectively restricts Traefik to run on a single node

    • A more sophisticated approach is storing the certificates on a persistent remote/network volume

    • Once we have full Cloudflare access, we can use the Cloudflare DNS challenge with a Cloudflare token. Then Traefik does not need to issue one certificate per subdomain. It’s unclear, though, whether this makes persisting the certificates obsolete.

Decision: We’ll use the Traefik resolver as long as we don’t run into any problems, since it seems to be less of a maintenance burden. cert-manager can still be introduced later if required.
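
A sketch of how the resolver could be wired into the Traefik instance shipped with k3s, using a HelmChartConfig override (the chart value names such as additionalArguments and persistence.* should be checked against the bundled Traefik chart version; e-mail address and paths are placeholders):

```yaml
# Sketch: overriding the k3s-bundled Traefik deployment to enable a Let's Encrypt resolver.
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    additionalArguments:
      - "--certificatesresolvers.letsencrypt.acme.email=ops@example.com"   # placeholder e-mail
      - "--certificatesresolvers.letsencrypt.acme.storage=/data/acme.json"
      - "--certificatesresolvers.letsencrypt.acme.tlschallenge=true"
    persistence:
      enabled: true    # keeps acme.json across restarts (the single-node restriction mentioned above)
      path: /data
```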

RabbitMQ

For RabbitMQ there are 3 potential ways of implementing:

  • Manually define a single-node StatefulSet as a 1:1 copy of faf-stack.

  • A Helm chart from Bitnami

  • Deploying the RabbitMQ operator

Decision: We’ll go with the Bitnami Helm chart. It is highly configurable and can read our existing secrets, so the template can be configured exactly as we need it. This simplifies things compared to a manual StatefulSet. The RabbitMQ operator seems much more complex for now.
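
A hedged sketch of what the chart usage could look like; the value names (auth.existingPasswordSecret, replicaCount, persistence.*) follow the Bitnami chart documentation and should be verified against the chart version in use, and all names are placeholders:

```yaml
# Sketch: values.yaml for the Bitnami RabbitMQ chart, reading credentials from an existing secret.
auth:
  username: faf-rabbitmq                     # placeholder user
  existingPasswordSecret: rabbitmq-secrets   # placeholder secret holding the RabbitMQ password
replicaCount: 1                              # single node, mirroring the faf-stack setup
persistence:
  enabled: true
  storageClass: local-storage                # placeholder, ties in with the volume management above
```

Installed with something like helm install rabbitmq bitnami/rabbitmq -f values.yaml after adding the Bitnami chart repository.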

User access and RBAC

  • We want to give access to multiple people with potentially different permissions.

  • Handing out service account certificates is quite annoying.

  • An SSO login via OIDC is preferred and supported by K8s / K3s.

    • The preferred identity provider would be GitHub, as all developers are there and it is outside the system itself. Unfortunately GitHub only supports OAuth2 and not OIDC.

    • Google accounts would be an alternative, but we don’t want to force people on Google.

    • We’ll use FAF’s own login instead (see the sketch after this list).

    • As a fallback (in case the FAF login is broken) we still have the main service account.

  • RBAC t.b.d.
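
A sketch of how the OIDC login could be wired into the k3s API server via /etc/rancher/k3s/config.yaml; the kube-apiserver flags themselves are standard, while the issuer URL, client ID and claim names are placeholders:

```yaml
# Sketch: k3s config passing OIDC flags to the embedded kube-apiserver.
kube-apiserver-arg:
  - "oidc-issuer-url=https://login.example.com"   # placeholder for the FAF login's issuer URL
  - "oidc-client-id=kubernetes"                   # placeholder client ID registered at the IdP
  - "oidc-username-claim=sub"                     # assumed claim mapping
  - "oidc-groups-claim=groups"                    # groups can later be bound to roles via RBAC
```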

Developer environment & reproducibility

  • No service shall go live if its initial configuration or installation can’t be scripted.

  • Everything must be runnable on a single-node cluster.

  • Scripts shall be idempotent / re-runnable without fatal consequences. We will use K8s annotations to keep track of the state (see the sketch below).
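
One possible shape of the annotation-based state tracking, as a sketch (the annotation key and namespace are hypothetical): a setup script writes a marker annotation after a successful run and reads it back (kubectl ... -o json piped through jq) before re-running anything destructive.

```yaml
# Sketch: a namespace carrying a hypothetical state annotation maintained by the setup scripts.
apiVersion: v1
kind: Namespace
metadata:
  name: faf-apps                                         # placeholder namespace
  annotations:
    k8s-config.example.com/setup-state: "initialized"    # hypothetical key, set once the init script succeeded
```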
