
Deploying an intelligent autoscaler into a Kubernetes cluster

The purpose of this repository is to train an agent with reinforcement learning so that it can outperform a standard Horizontal Pod Autoscaler (HPA) by allocating resources in a smarter way. This is done by looking at multiple metrics instead of solely relying on CPU utilisation.

The microservice architecture used here is built upon GoogleCloudPlatform/microservices-demo v0.5.2. On top of that repository, an RL agent has been developed to make scaling decisions and actually scale the services of the Online Boutique out or in when running in Kubernetes.

demo-architecture

What follows is a guide to deploying the agent in a Google Kubernetes Engine (GKE) cluster and locally using minikube. The local development environment is used mainly for testing the interaction between the agent, the Prometheus API and the Kubernetes API before deploying to the "production" environment in GKE. The GKE cluster is then used to check the actual behaviour of the algorithm in a more realistic environment and to compare the standard HPA with the intelligent autoscaler.

So take your cup of coffee or tea and let's begin! ☕️

How it works in a nutshell 🥜

architecture

In the cloud platform, I deployed the Kubernetes cluster, consisting of a master node and multiple worker nodes that host all the pods needed by the microservice application Online Boutique. Each pod contains one or more Docker containers, drawn as rounded-corner boxes in the figure. For example, the RL agent container holds the code that uses the trained model on the cluster, while Envoy represents the sidecar proxy injected by Istio into every pod, next to the Server container. Each Server is a microservice packaged into a container, for example productcatalog, frontend, etc. Each pod is assigned to a node by the kube-scheduler based on resource availability, and the whole Online Boutique application runs in the same namespace.

The communication between microservices occurs through the service mesh enabled by Istio. Specifically, Envoy proxies monitor each server and publish these data on a metrics endpoint. Both Prometheus and Grafana are deployed as pods inside the cluster. Every 5 s, Prometheus scrapes the metrics from Istio, which collects them from each Envoy proxy in the cluster. These metrics are stored in the Prometheus time-series database and are queried by Grafana and the RL agent. With Grafana, we can visualise the rate of change of each metric over time in a dashboard and download the data to our local development environment to process it and make our measurements.

The scaling action is performed by the RL agent container based on the observed state of the cluster. The agent queries Prometheus to receive aggregated pod metrics, which represent the state $S_{t}$, and takes a scaling action $A_{t}$ accordingly by calling the kube-api-server in the master node. Kubernetes is then responsible for executing the scaling action by increasing or decreasing the number of replicas of the target deployment.
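As a rough illustration of this observe/act cycle only (not of the RL policy itself), the sketch below queries the Prometheus HTTP API for an Istio request-rate metric and then asks Kubernetes to change the replica count. The Prometheus address, the query and the threshold are assumptions for the example, not the exact ones used by the agent.

# Hypothetical observe -> act cycle; address, query and threshold are placeholders
PROM=http://prometheus.istio-system.svc:9090
STATE=$(curl -s "$PROM/api/v1/query" \
  --data-urlencode 'query=sum(rate(istio_requests_total{destination_workload="frontend"}[1m]))' \
  | jq -r '.data.result[0].value[1]')
# a real agent feeds the state S_t to its policy; here a fixed threshold stands in for it
if [ "$(echo "$STATE > 100" | bc -l)" = "1" ]; then
  kubectl -n rl-agent scale deployment frontend --replicas=3   # action A_t: scale out
fi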

Developing on a cloud platform ☁️

Here, the agent is deployed in a cloud environment that is similar to a real-world scenario where scalability, availability and security are the main concerns.

⚠️ Cloud computing comes with a cost 💸

Using GKE, you will be charged for the resources you use. The main costs come from the components described in the following sections: the workstation VM, the Filestore NFS share, the GKE cluster itself and the Container Registry.

Picture of the infrastructure.

Setting up all components

First things first, you have to create a project with your Google account on Google Cloud Platform (GCP) as indicated in the official documentation.

Next, the following components are needed to have everything working on GKE. Specifically, we will connect to the Kubernetes cluster with a virtual machine and manage our pods and the agent from there.

Virtual Machine (VM) as main workstation to connect to the Kubernetes cluster

This VM will be used to run commands like kubectl to interact with the cluster and git pull to get the latest changes from the repository when developing.

  1. Create a VM instance with at least 2 vCPUs, 4GB of memory, 20GB of disk and Ubuntu as OS.
  2. SSH to the machine from your local computer or from the browser. You can use gcloud compute ssh --zone "<your_zone>" "<vm_instance_name>" --project "<your_project_name>" or connect directly from the user interface in GCP.
What to install in the Ubuntu VM
  • gcloud SDK: follow these instructions to install the SDK and then
    1. Install the package to authenticate with GKE sudo apt-get install google-cloud-cli-gke-gcloud-auth-plugin
    2. Install the Kubernetes command line if not already present with sudo apt-get install kubectl
  • GitHub CLI: follow these instructions to install gh, which will be necessary to authenticate to GitHub, push and pull changes to the repository. The complete command is reported below.
type -p curl >/dev/null || (sudo apt update && sudo apt install curl -y)
curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg \
&& sudo chmod go+r /usr/share/keyrings/githubcli-archive-keyring.gpg \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
&& sudo apt update \
&& sudo apt install gh -y
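Once gh is installed, authenticate interactively so that later git pull and git push commands against the repository work:

gh auth login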

Network File System (NFS) server as shared persistent disk

A persistent disk that can be shared across all components is necessary because virtual machines and containers are by definition ephemeral and there is no guarantee that data will be available across sessions. An NFS server will be used to share storage between the VM and pods.

You can follow these guidelines to set up an NFS server with Google Filestore. On the VM:

  1. Change directory and go into the mounted volume cd /mnt/<dir_name>
  2. Create folder keys/ to store secret keys created in the following steps.
  3. Clone this repository with git clone.

⚠️ If the VM is restarted, you need to remount the volume (the address of the NFS server may have changed):

sudo mount <nfs_server_ip_address>:/vol1 /mnt/nfs-client
sudo chmod go+rw /mnt/nfs-client

This volume will be claimed by certain pods using a Persistent Volume Claim.
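As a hedged sketch of what such a claim can look like (resource names, size and namespace are illustrative and may differ from the manifests in this repository), the NFS share can be exposed through a PersistentVolume and bound by a PersistentVolumeClaim:

# Illustrative only: actual PV/PVC names and sizes may differ in this repository
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 100Gi
  accessModes: ["ReadWriteMany"]
  nfs:
    server: <nfs_server_ip_address>
    path: /vol1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc
  namespace: rl-agent
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: ""
  resources:
    requests:
      storage: 100Gi
EOF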

Create a GKE cluster

You can follow the official documentation to set up your standard cluster. Nodes of type e2-standard-4 provide enough capacity to run the application and scale it.
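For reference, a cluster along these lines can be created with a single command (the cluster name, zone and node count here are just examples):

gcloud container clusters create online-boutique-cluster \
  --zone europe-west1-b \
  --machine-type e2-standard-4 \
  --num-nodes 3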

Now it's time to authenticate into the cluster from the VM. Run the following commands:

  1. Authenticate to the platform with your Google account with gcloud auth login
  2. Enter the project you created with gcloud config set project PROJECT_ID
  3. Log into the cluster with gcloud container clusters get-credentials <cluster-name> --zone <zone_of_your_cluster>
  4. You can try to create a namespace with kubectl create namespace rl-agent

To push Docker images to the Container Registry at gcr.io/<your_project_name> you will need to authenticate Docker to the Registry. You can follow the official documentation on how to authenticate, or simply run gcloud auth configure-docker.

To enable Kubernetes to pull the container images from your private Container Registry you have to create a JSON key file and embed it in a Kubernetes secret. Create a Service Account with Viewer permissions and create its key file, stored in /keys/, with gcloud iam service-accounts keys create KEY_FILE --iam-account=SA_NAME@PROJECT_ID.iam.gserviceaccount.com.
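Put together, the Service Account setup could look roughly like this (the account name image-puller is an example; the key path matches the one used in the secret command below):

gcloud iam service-accounts create image-puller --display-name "GCR image puller"
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member "serviceAccount:image-puller@PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/viewer"
gcloud iam service-accounts keys create ~/keys/img_pull_key.json \
  --iam-account image-puller@PROJECT_ID.iam.gserviceaccount.com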

Point Kubernetes to the key file to create a secret named gcr-json-key. You will need to run the following command for both the default and rl-agent namespaces so that both can pull the needed images.

kubectl create secret docker-registry gcr-json-key \
--docker-server=gcr.io \
--docker-username=_json_key \
--docker-password="$(cat ~/keys/img_pull_key.json)" \
--docker-email=<service-account-email> \
--namespace rl-agent
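The same command with --namespace default creates the secret for the default namespace; a small loop covers both:

for ns in default rl-agent; do
  kubectl create secret docker-registry gcr-json-key \
    --docker-server=gcr.io \
    --docker-username=_json_key \
    --docker-password="$(cat ~/keys/img_pull_key.json)" \
    --docker-email=<service-account-email> \
    --namespace "$ns"
done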

Check that these secrets have been created correctly with kubectl get secrets.

Another key file is needed, this time for the agent to authenticate to the GKE cluster from within its pod. You can use the default service account to create a second JSON key and save it to /keys/.

⚠️ Security

This key file is copied directly into the image, which is a security concern. A more secure implementation would be to create a Kubernetes secret and mount it into the pod.

Monitoring tools installations 📊

Monitoring serves two main purposes. One is to provide the agent with the information it needs about the current state of the cluster to take an action. The other is to monitor the behaviour of the intelligent autoscaler and compare it with the standard HPA.

These components are Istio, an open-source service mesh framework providing monitoring and load-balancing capabilities; Prometheus, an open-source time-series database used to store the monitored data scraped from Istio; and Grafana, an open-source project providing a monitoring dashboard connected to the database. Moreover, the Prometheus API will be queried by the agent to get the state of the environment. How the taken action affects the environment will be visible on a Grafana dashboard.

By default, Kubernetes emits only certain basic metrics, but the agent needs to know the current number of replicas being deployed. To enable these metrics you have to install the kube-state-metrics server. The installation is very simple:

  1. Download the official repository with git clone https://github.com/kubernetes/kube-state-metrics
  2. Apply its manifests for the standard installation kubectl apply -f kube-state-metrics/examples/standard/

You should now see a pod with a name similar to kube-state-metrics-65bf754b96-cvp75 when you list all pods with kubectl get pods -A. The file prometheus.yaml instructs Prometheus to scrape from this server, among other jobs.
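As a quick sanity check that the replica metric the agent relies on is actually exposed, you can port-forward the kube-state-metrics service and look for kube_deployment_status_replicas (the kube-system namespace and port 8080 are the defaults of the standard manifests and may differ in your setup):

kubectl -n kube-system port-forward svc/kube-state-metrics 8080:8080 &
curl -s localhost:8080/metrics | grep kube_deployment_status_replicas | head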

Service mesh with Istio ⛵️

Next, install Istio 1.17.1 by following the official documentation:

curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.17.1 TARGET_ARCH=x86_64 sh -

Use the default profile with istioctl install --set profile=default -y and label both namespaces for the injection of the Envoy sidecar with kubectl label namespace <namespace_name> istio-injection=enabled. The sidecar is the component that emits the metrics scraped by Prometheus.
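In practice this boils down to the install plus one label per namespace, for example:

istioctl install --set profile=default -y
for ns in default rl-agent; do
  kubectl label namespace "$ns" istio-injection=enabled
done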

Enabling Prometheus and Grafana

Prometheus, Kiali and Grafana are enabled with kubectl apply -f microservice-demo/kubernetes-manifests/istio-system.

Deploy the Online Boutique 🎊

Everything is ready now to deploy the Online Boutique application with all its services! A replica of the application will be deployed in the default namespace and another in the rl-agent namespace. The agent will only be present in the latter.

  1. Change directory to the repository folder cd microservice-demo/
  2. Build the Docker image for the agent: cd src/agent
  3. docker build -t agent:x.x . where x.x is the version of the image
  4. Tag the image with the path on your Container Registry like docker tag agent:x.x gcr.io/master-thesis-hpa/agent:x.x
  5. Now push the image to the Registry with docker push gcr.io/master-thesis-hpa/agent:x.x
  6. Do the same for the loadgenerator: cd src/loadgenerator, then docker build, docker tag and docker push (a consolidated script is sketched after this list).
  7. Deploy the replica of the application in the default namespace with skaffold run --default-repo gcr.io/master-thesis-hpa
  8. Now deploy the application in the rl-agent namespace. Go to cd kubernetes-manifests/rl-agent
  9. Apply the roles that allow the agent to change the number of replicas with kubectl apply -f agent-role.yaml and agent-roleBind.yaml
  10. Deploy all services with kubectl apply -f ./app
  11. Apply the standard HPA configuration for the frontend service in the default namespace with kubectl apply -f /autoscaling
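For convenience, steps 2 to 6 can be condensed into a short script (the registry path gcr.io/master-thesis-hpa is the one used above; adjust it and the version tag to your project):

VERSION=x.x   # replace with your image version
for svc in agent loadgenerator; do
  docker build -t "$svc:$VERSION" "src/$svc"
  docker tag "$svc:$VERSION" "gcr.io/master-thesis-hpa/$svc:$VERSION"
  docker push "gcr.io/master-thesis-hpa/$svc:$VERSION"
done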

⚠️ Update the image version in the deployment files after every new build

Do this both in /kubernetes-manifests/default/loadgenerator.yaml and kubernetes-manifests/rl-agent/app/loadgenerator.yaml for the load generator, and in kubernetes-manifests/rl-agent/app/agent.yaml for the agent.

Generating traffic load on the application with loadgenerator

Update the image version in /kubernetes-manifests/loadgenerator.yaml with the x.x version of your build and deploy the service with kubectl apply -f default/loadgenerator.yaml for the default namespace and kubectl apply -f rl-agent/app/loadgenerator.yaml for the other namespace.

Once deployed, both generators will simulate a load of requests to the frontend coming from a varying number of users. These users will look for products, add them to the cart and buy them. User behaviour is randomised across all possible interactions with the website.

Developing the RL agent 💡

The agent can be deployed with kubectl apply -f rl-agent/app/agent.yaml. Deploying this pod will automatically start the learning phase of the algorithms, which will log and checkpoint models at microservice-demo/src/agent/. You can execute commands inside the pod with kubectl exec -it -n rl-agent agent-<id> -- /bin/sh.
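To follow what the agent is doing once the pod is running, the usual kubectl commands apply:

kubectl -n rl-agent get pods               # find the agent-<id> pod name
kubectl -n rl-agent logs -f agent-<id>     # stream the training logs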


Developing locally on minikube 💻

Starting minikube and building images
  1. minikube start
  2. minikube mount ${HOME}/Documents/microservice-demo:/app/microservice-demo to mount the volume that will be used by the agent pod to run the code developed locally
  3. Open a new terminal
  4. Point your shell at minikube's Docker daemon with eval $(minikube -p minikube docker-env) (run minikube docker-env alone to see the variables it sets)
  5. cd src/agent-local
  6. docker build -t agent:<version> . Remember to change the version of the image in the deployment file to deploy that image in the cluster.
  7. cd src/loadgenerator
  8. docker build -t loadgenerator:<version> .
Use Istio for monitoring
  1. Install Istio following the official documentation.
  2. istioctl install --set profile=demo -y; this will allow you to use Prometheus and Grafana
  3. kubectl label namespace rl-agent istio-injection=enabled
  4. Verify the labelling with kubectl get namespaces --show-labels
  5. cd kubernetes-manifests/
  6. kubectl apply -f ./istio-system to deploy the Istio components
Deploy the Online Boutique application
  1. Go to kubernetes-manifests/rl-agent/local
  2. kubectl apply -f kube-manifests-local.yaml
  3. kubectl apply -f loadgenerator-local.yaml
  4. kubectl apply -f agent-role.yaml
  5. kubectl apply -f agent-roleBind.yaml
  6. kubectl apply -f agent-local.yaml
  7. In a new terminal, to run the load generator, execute the file workload_gen.py from within the pod: kubectl exec -it -n rl-agent loadgenerator-<id> -- /bin/sh
  8. In a new terminal, to run the agent code, execute kubectl exec -it -n rl-agent agent-<id> -- /bin/sh

Everything should be running locally by now. 😌 The monitoring part is the same as the one developed in GKE.

Note about volumes: when creating a volume, everything you modify in the pod (including deleted files) will be reflected on the host machine where the volume is created!

Tip: to delete all the resources in a namespace, run kubectl delete all --all -n <namespace>.


Resources

Tutorials I used and other references:
  • How to pull a Docker Image from GCR in any non-GCP Kubernetes cluster
  • Google Kubernetes Engine (GKE): Persistent Volume with Persistent Disks (NFS) on Multiple Nodes (ReadWriteMany)
  • Kubernetes Monitoring: Kube-State-Metrics