Commit

Merge pull request #98 from ricsanfre/argocd
Feature/Argocd
ricsanfre committed Jan 29, 2023
2 parents d7eca90 + 9835119 commit ff2c414
Showing 327 changed files with 15,063 additions and 12,092 deletions.
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -25,4 +25,5 @@ jobs:
run: pip3 install yamllint

- name: Lint all the YAMLs.
working-directory: ./ansible
run: yamllint .
9 changes: 5 additions & 4 deletions .gitignore
@@ -1,4 +1,5 @@
roles/ricsanfre.*
ansible_collections
certificates
docs/_site
/ansible/roles/ricsanfre.*
/ansible/ansible_collections
/certbot
/certificates
/docs/_site
92 changes: 92 additions & 0 deletions Makefile
@@ -0,0 +1,92 @@
.EXPORT_ALL_VARIABLES:

GPG_EMAIL=ricsanfre@gmail.com
GPG_NAME=Ricardo Sanchez

.PHONY: default
default: clean

.PHONY: prepare-ansible
prepare-ansible: install-ansible-requirements gpg-init ~/.vault/vault_passphrase.gpg ansible-credentials

.PHONY: clean
clean: k3s-reset external-services-reset

.PHONY: init
init: os-upgrade gateway-setup nodes-setup external-services configure-os-backup k3s-install k3s-bootstrap configure-monitoring-gateway

.PHONY: install-ansible-requirements
install-ansible-requirements: # install Ansible requirements
	cd ansible && ansible-galaxy install -r requirements.yml

.PHONY: install-ansible-requirements-force
install-ansible-requirements-force: # install Ansible requirements
	cd ansible && ansible-galaxy install -r requirements.yml --force

.PHONY: gpg-init
gpg-init:
	scripts/generate_gpg_key.sh

~/.vault/vault_passphrase.gpg: # Ansible vault gpg password
	mkdir -p ~/.vault
	pwgen -n 71 -C | head -n1 | gpg --armor --recipient ${GPG_EMAIL} -e -o ~/.vault/vault_passphrase.gpg

.PHONY: ansible-credentials
ansible-credentials: ~/.vault/vault_passphrase.gpg install-ansible-requirements
	cd ansible && ansible-playbook create_vault_credentials.yml

.PHONY: os-upgrade
os-upgrade:
	cd ansible && ansible-playbook update.yml

.PHONY: gateway-setup
gateway-setup:
	cd ansible && ansible-playbook setup_picluster.yml --tags "gateway"

.PHONY: nodes-setup
nodes-setup:
	cd ansible && ansible-playbook setup_picluster.yml --tags "nodes"

.PHONY: external-services
external-services:
	cd ansible && ansible-playbook external_services.yml

.PHONY: configure-os-backup
configure-os-backup:
	cd ansible && ansible-playbook backup_configuration.yml

.PHONY: configure-monitoring-gateway
configure-monitoring-gateway:
	cd ansible && ansible-playbook deploy_monitoring_agent.yml

.PHONY: os-backup
os-backup:
	cd ansible && ansible -b -m shell -a 'systemctl start restic-backup' raspberrypi

.PHONY: k3s-install
k3s-install:
	cd ansible && ansible-playbook k3s_install.yml

.PHONY: k3s-bootstrap
k3s-bootstrap:
	cd ansible && ansible-playbook k3s_bootstrap.yml

.PHONY: k3s-reset
k3s-reset:
	cd ansible && ansible-playbook k3s_reset.yml

.PHONY: external-services-reset
external-services-reset:
	cd ansible && ansible-playbook reset_external_services.yml

.PHONY: shutdown-k3s-worker
shutdown-k3s-worker:
	cd ansible && ansible -b -m shell -a "shutdown -h 1 min" k3s_worker

.PHONY: shutdown-k3s-master
shutdown-k3s-master:
	cd ansible && ansible -b -m shell -a "shutdown -h 1 min" k3s_master

.PHONY: shutdown-gateway
shutdown-gateway:
	cd ansible && ansible -b -m shell -a "shutdown -h 1 min" gateway
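The `init` rule chains eight stages through its prerequisites. The following is a minimal sketch, not part of the repository, of a fail-fast wrapper that runs those stages one at a time and reports which one stopped the run. `run_init` and `INIT_TARGETS` are hypothetical names; the target list is copied from the `init` prerequisites above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository.
# Target names are copied from the prerequisites of the `init` rule.
INIT_TARGETS="os-upgrade gateway-setup nodes-setup external-services configure-os-backup k3s-install k3s-bootstrap configure-monitoring-gateway"

# run_init [RUNNER]
# Calls RUNNER once per target, in order, and stops at the first failure.
# RUNNER defaults to `make`; any command or function taking a target name works.
run_init() {
    runner="${1:-make}"
    for target in $INIT_TARGETS; do
        echo "==> $target"
        if ! "$runner" "$target"; then
            echo "stage failed: $target" >&2
            return 1
        fi
    done
}
```

Called without arguments from the repository root, `run_init` would invoke `make <target>` for each stage. Passing a different runner (for example a function that only prints its argument) gives a dry run of the sequence.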
195 changes: 188 additions & 7 deletions README.md
@@ -9,25 +9,206 @@
</tr>
</table>

## **K3S Kubernetes Cluster using bare metal ARM-based nodes (Raspberry-PIs) automated with Ansible**
**K3S Kubernetes Cluster using bare metal ARM-based nodes (Raspberry-PIs) automated with Ansible and ArgoCD**

This is an educational project to explore kubernetes cluster configurations using an ARM architecture and its automation using Ansible.
This is an educational project to explore kubernetes cluster configurations using an ARM architecture and apply IaC (Infrastructure as Code) and GitOps methodologies to automate its provisioning and management.

The entire process for creating this cluster at home, from cluster design and architecture to step-by-step manual configuration guides, has been documented and published on the project website: https://picluster.ricsanfre.com.

This repository contains the Ansible source code (playbooks/roles) and cloud-init configuration files used to automate all manual tasks described in the documentation.
The cluster can be re-deployed in minutes as many times as needed to test new cluster configurations or new software versions, or just to get you out of any mess you might cause while playing with the cluster.
This repository contains all source code used to automate all manual tasks described in the documentation: Cloud-init's configuration files, Ansible's source code (playbooks/roles), and packaged Kubernetes applications (helm and kustomize) to be deployed using ArgoCD.

Since its deployment is completely automated, the cluster can be re-deployed in minutes as many times as needed to test new cluster configurations or new software versions, or just to get you out of any mess you might cause while playing with the cluster.

## Scope

Automatically deploy and configure a lightweight Kubernetes flavor based on [K3S](https://k3s.io/) and deploy cluster basic services such as: 1) distributed block storage for POD's persistent volumes, [LongHorn](https://longhorn.io/), 2) backup/restore solution for the cluster, [Velero](https://velero.io/) and [Restic](https://restic.net/), 3) service mesh architecture, [Linkerd](https://linkerd.io/), and 4) observability platform based on metrics monitoring solution, [Prometheus](https://prometheus.io/), logging and analytics solution, EFK+LG stack ([Elasticsearch](https://www.elastic.co/elasticsearch/)-[Fluentd](https://www.fluentd.org/)/[Fluentbit](https://fluentbit.io/)-[Kibana](https://www.elastic.co/kibana/) + [Loki](https://grafana.com/oss/loki/)-[Grafana](https://grafana.com/oss/grafana/)), and distributed tracing solution, [Tempo](https://grafana.com/oss/tempo/).
The scope of this project is to create a Kubernetes cluster at home using **Raspberry Pis** and to automate its deployment and configuration by applying **IaC (Infrastructure as Code)** and **GitOps** methodologies with tools like [Ansible](https://docs.ansible.com/), [cloud-init](https://cloudinit.readthedocs.io/en/latest/) and [Argo CD](https://argo-cd.readthedocs.io/en/stable/).

As part of the project, the goal is to use a lightweight Kubernetes flavor based on [K3S](https://k3s.io/) and deploy cluster basic services such as: 1) distributed block storage for POD's persistent volumes, [LongHorn](https://longhorn.io/), 2) backup/restore solution for the cluster, [Velero](https://velero.io/) and [Restic](https://restic.net/), 3) service mesh architecture, [Linkerd](https://linkerd.io/), and 4) observability platform based on metrics monitoring solution, [Prometheus](https://prometheus.io/), logging and analytics solution, EFK+LG stack ([Elasticsearch](https://www.elastic.co/elasticsearch/)-[Fluentd](https://www.fluentd.org/)/[Fluentbit](https://fluentbit.io/)-[Kibana](https://www.elastic.co/kibana/) + [Loki](https://grafana.com/oss/loki/)-[Grafana](https://grafana.com/oss/grafana/)), and distributed tracing solution, [Tempo](https://grafana.com/oss/tempo/).

## Technology Stack

The following picture shows the set of open-source solutions used so far in the cluster, whose installation process has been documented and whose deployment has been automated with Ansible:
The following picture shows the set of open-source solutions used so far in the cluster, whose installation process has been documented and whose deployment has been automated with Ansible/ArgoCD:

<p align="center">
<img src="docs/assets/img/pi-cluster-icons.png" width="500"/>
</p>

<div class="d-flex">
<table class="table table-white table-borderer border-dark w-auto align-middle">
<tr>
<th></th>
<th>Name</th>
<th>Description</th>
</tr>
<tr>
<td><img width="32" src="https://simpleicons.org/icons/ansible.svg"></td>
<td><a href="https://www.ansible.com">Ansible</a></td>
<td>Automate OS configuration, external services installation and k3s installation and bootstrapping</td>
</tr>
<tr>
<td><img width="32" src="https://cncf-branding.netlify.app/img/projects/argo/icon/color/argo-icon-color.svg"></td>
<td><a href="https://argoproj.github.io/cd">ArgoCD</a></td>
<td>GitOps tool for deploying applications to Kubernetes</td>
</tr>
<tr>
<td><img width="32" src="https://cloud-init.github.io/images/cloud-init-orange.svg"></td>
<td><a href="https://cloudinit.readthedocs.io/en/latest/">Cloud-init</a></td>
<td>Automate OS initial installation</td>
</tr>
<tr>
<td><img width="32" src="https://assets.ubuntu.com/v1/ce518a18-CoF-2022_solid+O.svg"></td>
<td><a href="https://ubuntu.com/">Ubuntu</a></td>
<td>Cluster nodes OS</td>
</tr>
<tr>
<td><img width="32" src="https://cncf-branding.netlify.app/img/projects/k3s/icon/color/k3s-icon-color.svg"></td>
<td><a href="https://k3s.io/">K3S</a></td>
<td>Lightweight distribution of Kubernetes</td>
</tr>
<tr>
<td><img width="32" src="https://cncf-branding.netlify.app/img/projects/containerd/icon/color/containerd-icon-color.svg"></td>
<td><a href="https://containerd.io/">containerd</a></td>
<td>Container runtime integrated with K3S</td>
</tr>
<tr>
<td><img width="20" src="https://raw.githubusercontent.com/flannel-io/flannel/master/logos/flannel-glyph-color.svg"></td>
<td><a href="https://github.com/flannel-io/flannel">Flannel</a></td>
<td>Kubernetes Networking (CNI) integrated with K3S</td>
</tr>
<tr>
<td><img width="32" src="https://cncf-branding.netlify.app/img/projects/coredns/icon/color/coredns-icon-color.svg"></td>
<td><a href="https://coredns.io/">CoreDNS</a></td>
<td>Kubernetes DNS</td>
</tr>
<tr>
<td><img width="32" src="https://metallb.universe.tf/images/logo/metallb-blue.png"></td>
<td><a href="https://metallb.universe.tf/">Metal LB</a></td>
<td>Load-balancer implementation for bare metal Kubernetes clusters</td>
</tr>
<tr>
<td><img width="32" src="https://landscape.cncf.io/logos/traefik.svg"></td>
<td><a href="https://traefik.io/">Traefik</a></td>
<td>Kubernetes Ingress Controller</td>
</tr>
<tr>
<td><img width="32" src="https://cncf-branding.netlify.app/img/projects/linkerd/icon/color/linkerd-icon-color.svg"></td>
<td><a href="https://linkerd.io/">Linkerd</a></td>
<td>Kubernetes Service Mesh</td>
</tr>
<tr>
<td><img width="32" src="https://cncf-branding.netlify.app/img/projects/longhorn/icon/color/longhorn-icon-color.svg"></td>
<td><a href="https://longhorn.io/">Longhorn</a></td>
<td>Kubernetes distributed block storage</td>
</tr>
<tr>
<td><img width="60" src="https://min.io/resources/img/logo.svg"></td>
<td><a href="https://min.io/">Minio</a></td>
<td>S3 Object Storage solution</td>
</tr>
<tr>
<td><img width="32" src="https://landscape.cncf.io/logos/cert-manager.svg"></td>
<td><a href="https://cert-manager.io">Cert-manager</a></td>
<td>TLS Certificates management</td>
</tr>
<tr>
<td><img width="32" src="https://simpleicons.org/icons/vault.svg"></td>
<td><a href="https://www.vaultproject.io/">Hashicorp Vault</a></td>
<td>Secrets Management solution</td>
</tr>
<tr>
<td><img width="32" src="https://landscape.cncf.io/logos/external-secrets.svg"></td>
<td><a href="https://external-secrets.io/">External Secrets Operator</a></td>
<td>Sync Kubernetes Secrets from Hashicorp Vault</td>
</tr>
<tr>
<td><img width="60" src="https://velero.io/img/Velero.svg"></td>
<td><a href="https://velero.io/">Velero</a></td>
<td>Kubernetes Backup and Restore solution</td>
</tr>
<tr>
<td><img width="32" src="https://github.com/restic/restic/raw/master/doc/logo/logo.png"></td>
<td><a href="https://restic.net/">Restic</a></td>
<td>OS Backup and Restore solution</td>
</tr>
<tr>
<td><img width="32" src="https://cncf-branding.netlify.app/img/projects/prometheus/icon/color/prometheus-icon-color.png"></td>
<td><a href="https://prometheus.io/">Prometheus</a></td>
<td>Metrics monitoring and alerting</td>
</tr>
<tr>
<td><img width="32" src="https://cncf-branding.netlify.app/img/projects/fluentd/icon/color/fluentd-icon-color.png"></td>
<td><a href="https://www.fluentd.org/">Fluentd</a></td>
<td>Logs forwarding and distribution</td>
</tr>
<tr>
<td><img width="60" src="https://fluentbit.io/images/logo.svg"></td>
<td><a href="https://fluentbit.io/">Fluentbit</a></td>
<td>Logs collection</td>
</tr>
<tr>
<td><img width="32" src="https://github.com/grafana/loki/blob/main/docs/sources/logo.png?raw=true"></td>
<td><a href="https://grafana.com/oss/loki/">Loki</a></td>
<td>Logs aggregation</td>
</tr>
<tr>
<td><img width="32" src="https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/blt36f2da8d650732a0/5d0823c3d8ff351753cbc99f/logo-elasticsearch-32-color.svg"></td>
<td><a href="https://www.elastic.co/elasticsearch/">Elasticsearch</a></td>
<td>Logs analytics</td>
</tr>
<tr>
<td><img width="32" src="https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/blt4466841eed0bf232/5d082a5e97f2babb5af907ee/logo-kibana-32-color.svg"></td>
<td><a href="https://www.elastic.co/kibana/">Kibana</a></td>
<td>Logs analytics Dashboards</td>
</tr>
<tr>
<td><img width="32" src="https://grafana.com/static/assets/img/logos/grafana-tempo.svg"></td>
<td><a href="https://grafana.com/oss/tempo/">Tempo</a></td>
<td>Distributed tracing monitoring</td>
</tr>
<tr>
<td><img width="32" src="https://grafana.com/static/img/menu/grafana2.svg"></td>
<td><a href="https://grafana.com/oss/grafana/">Grafana</a></td>
<td>Monitoring Dashboards</td>
</tr>
</table>
</div>

## External Resources and Services

Even though the premise is to deploy all services in the Kubernetes cluster, a few external services/resources are still needed. Below is a list of these external resources/services and why they are needed.

### Cloud external services


| |Provider | Resource | Purpose |
| --- | --- | --- | --- |
| <img width="60" src="https://letsencrypt.org/images/letsencrypt-logo-horizontal.svg" >| [Letsencrypt](https://letsencrypt.org/) | TLS CA Authority | Signed valid TLS certificates |
| <img width="60" src="https://www.ionos.de/newsroom/wp-content/uploads/2022/03/LOGO_IONOS_Blue_RGB-1.png"> |[IONOS](https://www.ionos.es/) | DNS | DNS and [DNS-01 challenge](https://letsencrypt.org/docs/challenge-types/#dns-01-challenge) for certificates |

> **NOTE:** These resources are optional. The homelab still works without them, but it won't have trusted certificates.

**Alternatives:**

1. Use a private PKI (custom CA to sign certificates).

Currently supported. Only minor changes are required. See details in [Doc: Quick Start instructions](https://picluster.ricsanfre.com/docs/ansible).

2. Use another DNS provider.

Cert-manager / Certbot, used to automatically obtain certificates from Let's Encrypt, can be used with other DNS providers. This requires further modifications to the way the cert-manager application is deployed (new providers and/or webhooks/plugins might be required).

Currently, only the ACME issuer (Let's Encrypt) using IONOS as the DNS-01 challenge provider is configured. Check the list of [supported dns01 providers](https://cert-manager.io/docs/configuration/acme/dns01/#supported-dns01-providers).

### Self-hosted external services

There is another list of services that I have decided to run outside the Kubernetes cluster without using any cloud service. These services currently run on the same cluster nodes (gateway and node1), but as bare-metal services.

| |External Service | Resource | Purpose |
| --- | --- | --- | --- |
| <img width="60" src="https://min.io/resources/img/logo.svg"> |[Minio](https://min.io) | S3 Object Store | Cluster Backup |
| <img width="32" src="https://simpleicons.org/icons/vault.svg"> |[Hashicorp Vault](https://www.vaultproject.io/) | Secrets Management | Cluster secrets management |


## Cluster architecture and hardware

The home lab architecture, shown in the picture below, consists of a Kubernetes cluster of 5 nodes (1 master and 4 workers) and a firewall, built with another Raspberry Pi, to isolate the cluster network from your home network.
@@ -47,7 +228,7 @@ The content of this website and the source code to build it (Jekyll static based

## Usage

Check out the documentation [Quick Start guide](http://picluster.ricsanfre.com/docs/ansible/) to learn how to use and tweak the cloud-init files (`/cloud-init` folder) and Ansible playbooks contained in this repository.
Check out the documentation [Quick Start guide](http://picluster.ricsanfre.com/docs/ansible/) to learn how to use and tweak the cloud-init files (`/cloud-init` folder), Ansible playbooks (`/ansible` folder) and packaged Kubernetes applications (`/argocd` folder) contained in this repository, so you can use them for your own homelab.

## About the Project

2 changes: 2 additions & 0 deletions ansible/.vault/vault_pass.sh
@@ -0,0 +1,2 @@
#!/bin/sh
gpg --batch --use-agent --decrypt $HOME/.vault/vault_passphrase.gpg
File renamed without changes.
2 changes: 2 additions & 0 deletions ansible.cfg → ansible/ansible.cfg
@@ -11,3 +11,5 @@ collections_path = ./
host_key_checking = false
# SSH key
private_key_file = $HOME/.ssh/ansible-ssh-key.pem
# Vault password
vault_password_file=./.vault/vault_pass.sh
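When `vault_password_file` points at an executable file, Ansible runs it and takes the vault password from its standard output, which is how `vault_pass.sh` hands the gpg-decrypted passphrase to Ansible. The sketch below is a hypothetical, more defensive variant of that script, not the committed file: it quotes the path and fails with a clear message when the encrypted passphrase is missing. `VAULT_PASSPHRASE_FILE` and `VAULT_DECRYPT_CMD` are illustrative overrides introduced here (they are not Ansible or gpg settings).

```shell
#!/bin/sh
# Hypothetical variant of vault_pass.sh; not the file added in this commit.
# VAULT_PASSPHRASE_FILE and VAULT_DECRYPT_CMD are illustrative overrides only.

# vault_pass
# Prints the decrypted vault passphrase on stdout, or returns 1 with a
# message on stderr if the encrypted file cannot be read.
vault_pass() {
    file="${VAULT_PASSPHRASE_FILE:-$HOME/.vault/vault_passphrase.gpg}"
    if [ ! -r "$file" ]; then
        echo "vault passphrase file not readable: $file" >&2
        return 1
    fi
    # Default command is the same gpg invocation used by the committed script.
    # The expansion is left unquoted on purpose so the command splits into words.
    ${VAULT_DECRYPT_CMD:-gpg --batch --use-agent --decrypt} "$file"
}
```

Used as an actual password script, the file would end with a call to `vault_pass` so that the passphrase reaches stdout. The command override exists only so the function can be exercised without a gpg key.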
24 changes: 24 additions & 0 deletions ansible/backup_configuration.yml
@@ -0,0 +1,24 @@
---

- name: Configure Pi-cluster nodes backup
  hosts: raspberrypi
  gather_facts: true
  tags: [backup]
  become: true
  pre_tasks:
    - name: Include vault variables
      include_vars: "vars/vault.yml"
    # Include picluster variables
    - name: Include picluster variables
      include_vars: "vars/picluster.yml"
    - name: Load CA certificate for restic
      set_fact:
        restic_ca_cert: "{{ lookup('file','certificates/CA.pem') }}"
      when: not enable_letsencrypt
    - name: Do not use CA certificate
      set_fact:
        restic_use_ca_cert: false
      when: enable_letsencrypt
  roles:
    - role: ricsanfre.backup
      tags: [backup]
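The two `set_fact` tasks above branch on `enable_letsencrypt`: the custom CA certificate is only loaded when Let's Encrypt is disabled, and the `file` lookup fails if `certificates/CA.pem` cannot be found. The sketch below shows one way that precondition might be checked before running the play. `preflight_ca` is a hypothetical helper, not part of the repository, and the path passed to it is an assumption about where the CA file lives relative to the caller.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of the repository.

# preflight_ca ENABLE_LETSENCRYPT CA_PATH
# Returns 0 when the backup play has what it needs, 1 otherwise.
preflight_ca() {
    enable_letsencrypt="$1"
    ca_path="$2"
    if [ "$enable_letsencrypt" = "true" ]; then
        # Mirrors the "Do not use CA certificate" branch: no custom CA is loaded.
        return 0
    fi
    if [ -r "$ca_path" ]; then
        # Mirrors the "Load CA certificate for restic" branch: file is available.
        return 0
    fi
    echo "custom CA certificate not readable: $ca_path" >&2
    return 1
}
```

A call such as `preflight_ca false ansible/certificates/CA.pem && make configure-os-backup` would skip the playbook when the CA file is missing; the path in that call is illustrative and should match wherever the private CA was actually generated.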
