| title | description | date | categories | tags | weight |
|---|---|---|---|---|---|
| Scaling Spin App With Kubernetes Event-Driven Autoscaling (KEDA) | This tutorial illustrates how one can horizontally scale Spin Apps in Kubernetes using Kubernetes Event-Driven Autoscaling (KEDA) | 2024-02-16 | | | 100 |

KEDA extends Kubernetes to provide event-driven scaling capabilities, allowing it to react to events from Kubernetes internal and external sources using KEDA scalers. KEDA provides a wide variety of scalers to define scaling behavior based on sources like CPU, Memory, Azure Event Hubs, Kafka, RabbitMQ, and more. We use a ScaledObject to dynamically scale the instance count of our SpinApp to meet demand.
We use k3d to run a Kubernetes cluster locally as part of this tutorial, but you can follow these steps to configure KEDA autoscaling on your desired Kubernetes environment.
Please see the following sections of the [Prerequisites]({{< ref "prerequisites" >}}) page and fulfil those requirements before continuing:
- [kubectl]({{< ref "prerequisites#kubectl" >}}) - the Kubernetes CLI
- [k3d]({{< ref "prerequisites#k3d" >}}) - a lightweight Kubernetes distribution that runs on Docker
- [Docker]({{< ref "prerequisites#docker" >}}) - for running k3d
- [Helm]({{< ref "prerequisites#helm" >}}) - the package manager for Kubernetes
- [Bombardier]({{< ref "prerequisites#bombardier" >}}) - cross-platform HTTP benchmarking CLI
If you haven't already, please go ahead and clone the Spin Operator repository:
git clone https://github.com/spinkube/spin-operator.git
Change into the Spin Operator directory:
cd spin-operator
Run the following command to create a Kubernetes cluster that has the containerd-wasm-shims prerequisites installed. If you already have a Kubernetes cluster, feel free to use it instead:
k3d cluster create wasm-cluster-scale --image ghcr.io/deislabs/containerd-wasm-shims/examples/k3d:v0.11.0 -p "8081:80@loadbalancer" --agents 2
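Before moving on, it can help to confirm that the cluster is reachable. The sketch below assumes the k3d command above was used unchanged, in which case one server node plus the two agents requested with --agents 2 should be listed. The helper name count_nodes is hypothetical and only wraps kubectl get nodes:

```shell
# Hypothetical helper: count the nodes visible in the current kubectl context.
count_nodes() {
  # --no-headers drops the column header so each output line is one node
  kubectl get nodes --no-headers | wc -l | tr -d ' '
}

# Usage (expects 3 for one server and two agents):
#   [ "$(count_nodes)" -eq 3 ] && echo "cluster is up"
```

If you are using an existing cluster instead, the expected node count will differ.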
Next, from within the spin-operator directory, run the following commands to install the Spin runtime class and Spin Operator:
kubectl apply -f config/samples/spin-runtime-class.yaml
make install
Lastly, start the operator locally with the following command:
make run
Great, now you have Spin Operator up and running on your cluster. This means you’re set to create and deploy SpinApps later on in the tutorial.
Use the following command to set up ingress on your Kubernetes cluster. This ensures traffic can reach your SpinApp once we’ve created it in future steps:
# Setup ingress following this tutorial https://k3d.io/v5.4.6/usage/exposing_services/
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx
  annotations:
    ingress.kubernetes.io/ssl-redirect: "false"
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: keda-spinapp
                port:
                  number: 80
EOF
Hit enter to create the ingress resource.
Use the following commands to set up KEDA on your Kubernetes cluster using Helm. Other deployment methods are described at Deploying KEDA on keda.sh:
# Add the Helm repository
helm repo add kedacore https://kedacore.github.io/charts
# Update your Helm repositories
helm repo update
# Install the keda Helm chart into the keda namespace
helm install keda kedacore/keda --namespace keda --create-namespace
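The KEDA components take a short while to start after the Helm install returns. As a minimal sketch, assuming the chart was installed into the keda namespace as above, the hypothetical helper below uses kubectl wait to block until every Deployment in that namespace reports the Available condition. The 120s default timeout is an arbitrary choice:

```shell
# Hypothetical helper: wait for all Deployments in the keda namespace.
# $1 is an optional timeout (defaults to 120s, an arbitrary value).
wait_for_keda() {
  kubectl wait --namespace keda \
    --for=condition=Available deployment --all \
    --timeout="${1:-120s}"
}

# Usage:
#   wait_for_keda        # default timeout
#   wait_for_keda 300s   # longer timeout for slow image pulls
```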
Next, we’re going to build the SpinApp we will be scaling and store it in the ttl.sh registry. We've chosen ttl.sh for ease of setup, but you're welcome to use any OCI registry of your choosing. Change into the apps/cpu-load-gen directory and build the SpinApp we’ve provided:
# Build and publish the sample app
cd apps/cpu-load-gen
spin build
spin registry push ttl.sh/cpu-load-gen:1h
Note that the tag at the end of ttl.sh/cpu-load-gen:1h indicates how long the image will last, e.g. 1h (1 hour). The maximum is 24h, so you will need to push the image again if you need it for longer than that.
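Because ttl.sh is a public, anonymous registry, a generic name such as cpu-load-gen can collide with an image pushed by someone else. One possible workaround, sketched below, is to add a unique suffix to the image name. The SUFFIX and IMAGE variables are illustrative and not part of the Spin Operator repository:

```shell
# Build a reasonably unique image reference (timestamp plus shell PID).
# SUFFIX and IMAGE are hypothetical names used only in this sketch.
SUFFIX="$(date +%s)-$$"
IMAGE="ttl.sh/cpu-load-gen-${SUFFIX}:1h"
echo "$IMAGE"

# Push under the unique name instead of the shared one:
#   spin registry push "$IMAGE"
```

If you push under a different name, remember to set the same reference in the image field of config/samples/keda-app.yaml.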
We can take a look at the SpinApp and the KEDA ScaledObject definitions in our deployment files below. As you can see, we have explicitly set resource limits of 500m of cpu (spec.resources.limits.cpu) and 500Mi of memory (spec.resources.limits.memory) per SpinApp:
# config/samples/keda-app.yaml
apiVersion: core.spinoperator.dev/v1alpha1
kind: SpinApp
metadata:
  name: keda-spinapp
spec:
  # TODO: Depend on a ghcr.io version of this image
  image: "ttl.sh/cpu-load-gen:1h"
  executor: containerd-shim-spin
  enableAutoscaling: true
  replicas: 1
  resources:
    limits:
      cpu: 500m
      memory: 500Mi
    requests:
      cpu: 100m
      memory: 400Mi
---
We will scale the instance count when we’ve reached 50% CPU utilization (spec.triggers[cpu].metadata.value). We’ve also instructed KEDA to scale our SpinApp horizontally within the range of 1 (spec.minReplicaCount) to 20 (spec.maxReplicaCount):
# config/samples/keda-scaledobject.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cpu-scaling
spec:
  scaleTargetRef:
    name: keda-spinapp
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "50"
The Kubernetes documentation is the place to learn more about limits and requests. Consult the KEDA documentation to learn more about ScaledObject and KEDA's built-in scalers.
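To get a feel for what the 50% target means in practice, note that KEDA's cpu trigger is served by a Kubernetes Horizontal Pod Autoscaler, and that utilization is measured against the CPU request (100m here), not the limit. A target of 50 therefore corresponds to an average of roughly 50m of CPU per pod. The sketch below applies the replica formula from the Kubernetes HPA documentation, ceil(currentReplicas * currentUtilization / targetUtilization), and clamps the result to the ScaledObject's range. It deliberately ignores the HPA's tolerance, stabilization window, and scale-up rate limits, so treat it as an approximation; desired_replicas is a hypothetical helper name:

```shell
# Hypothetical helper approximating the HPA replica calculation.
# Arguments: current replicas, current utilization (%), target utilization (%),
#            minReplicaCount, maxReplicaCount
desired_replicas() {
  current=$1
  util=$2
  target=$3
  min=$4
  max=$5
  # Integer ceiling of current * util / target
  desired=$(( (current * util + target - 1) / target ))
  # Clamp to the range configured on the ScaledObject
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# One pod running at 200% of its CPU request, target 50%, range 1 to 20
desired_replicas 1 200 50 1 20    # prints 4
# Sixteen pods at 90% would need 29 replicas, so the maximum applies
desired_replicas 16 90 50 1 20    # prints 20
```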
Let’s deploy the SpinApp and the KEDA ScaledObject instance onto our cluster with the following commands:
# Deploy the SpinApp
kubectl apply -f config/samples/keda-app.yaml
spinapp.core.spinoperator.dev/keda-spinapp created
# Deploy the ScaledObject
kubectl apply -f config/samples/keda-scaledobject.yaml
scaledobject.keda.sh/cpu-scaling created
You can see your running Spin application by running the following command:
kubectl get spinapps
NAME READY REPLICAS EXECUTOR
keda-spinapp 1 containerd-shim-spin
You can also see your KEDA ScaledObject instance with the following command:
kubectl get scaledobject
NAME SCALETARGETKIND SCALETARGETNAME MIN MAX TRIGGERS READY ACTIVE AGE
cpu-scaling apps/v1.Deployment keda-spinapp 1 20 cpu True True 7m
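Behind the scenes, KEDA creates a Horizontal Pod Autoscaler for each ScaledObject. As a small sketch, assuming KEDA's default naming scheme of keda-hpa-<ScaledObject name> has not been overridden, the hypothetical helper below looks up that HPA so you can see the current and target CPU utilization:

```shell
# Hypothetical helper: show the HPA that KEDA manages for a ScaledObject.
# Assumes the default HPA name, keda-hpa-<ScaledObject name>.
show_keda_hpa() {
  kubectl get hpa "keda-hpa-$1"
}

# Usage:
#   show_keda_hpa cpu-scaling
```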
Now let’s use Bombardier to generate traffic to test how well KEDA scales our SpinApp. The following Bombardier command opens up to 40 concurrent connections (-c 40) and sends requests for 3 minutes (-d 3m). Any request that is not answered within 5 seconds (-t 5s) will time out:
# Generate a bunch of load
bombardier -c 40 -t 5s -d 3m http://localhost:8081
To watch the SpinApp scale under load, we can run the following command to get the status of our deployment:
kubectl describe deploy keda-spinapp
...
---
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: keda-spinapp-76db5d7f9f (1/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 84s deployment-controller Scaled up replica set keda-spinapp-76db5d7f9f to 2 from 1
Normal ScalingReplicaSet 69s deployment-controller Scaled up replica set keda-spinapp-76db5d7f9f to 4 from 2
Normal ScalingReplicaSet 54s deployment-controller Scaled up replica set keda-spinapp-76db5d7f9f to 8 from 4
Normal ScalingReplicaSet 39s deployment-controller Scaled up replica set keda-spinapp-76db5d7f9f to 16 from 8
Normal ScalingReplicaSet 24s deployment-controller Scaled up replica set keda-spinapp-76db5d7f9f to 20 from 16
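As an alternative to re-running kubectl describe, the ready replica count can be sampled at a fixed interval while Bombardier is running. The sketch below is one way to do that. The helper names, the default of 12 samples, and the 15 second interval are all arbitrary choices for illustration:

```shell
# Hypothetical helper: print the number of ready replicas of a Deployment.
# The field is omitted by Kubernetes when no replica is ready, so the
# output can be empty.
ready_replicas() {
  kubectl get deployment "$1" -o jsonpath='{.status.readyReplicas}'
}

# Hypothetical helper: sample the ready replica count repeatedly.
# Arguments: deployment name, number of samples (default 12),
#            seconds between samples (default 15)
watch_replicas() {
  name=$1
  samples=${2:-12}
  interval=${3:-15}
  i=0
  while [ "$i" -lt "$samples" ]; do
    echo "$(date +%T) ready=$(ready_replicas "$name")"
    i=$((i + 1))
    # Sleep between samples, but not after the last one
    if [ "$i" -lt "$samples" ]; then sleep "$interval"; fi
  done
}

# Usage, in a second terminal while the load test runs:
#   watch_replicas keda-spinapp
```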