
[ENH] - Add kubernetes horizontal autoscaler for conda-store workers based on queue depth #2284

Open

dcmcand opened this issue Feb 29, 2024 · 13 comments · May be fixed by #2384

Comments

@dcmcand (Contributor) commented Feb 29, 2024

Feature description

Currently conda-store is configured to allow 4 simultaneous builds. This becomes a bottleneck once multiple environments are being built at the same time and presents a scaling challenge. If we set the number of simultaneous builds per worker to 1 and autoscale the workers based on queue depth, we should be able to handle scaling far more gracefully.

Value and/or benefit

Having the conda-store workers autoscale based on queue depth will allow larger orgs to take advantage of Nebari without hitting scale bottlenecks.

Anything else?

https://learnk8s.io/scaling-celery-rabbitmq-kubernetes

@pt247 (Contributor) commented Mar 9, 2024

Options

We have two options to achieve this:

  1. Horizontal Pod Autoscaler
  2. KEDA (Kubernetes-based Event-driven Autoscaling)

Option#1 Horizontal Pod Autoscaler based on external metrics and a load monitor/watcher.

Ref: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
The sequence of events:

  1. A build watcher queries the conda-store database every 5 seconds and publishes the total number of queued builds.
  2. Horizontal autoscaler takes this value as an external metric to scale on:
- type: External
  external:
    metric:
      name: queue_messages_ready
      selector:
        matchLabels:
          queue: "worker_tasks"
    target:
      type: AverageValue ## This needs to change accordingly. 
      averageValue: 0
  3. Horizontal Pod Autoscaler (HPA) creates new worker pods according to the number of queued builds (a fuller manifest sketch follows below).
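For concreteness, a minimal sketch of what the complete HPA manifest could look like, assuming the build watcher exposes the queue depth as an external metric named queue_messages_ready and the worker Deployment is called nebari-conda-store-worker. The HPA name, maxReplicas, and target value below are illustrative, not decided:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: conda-store-worker-hpa      # hypothetical name
  namespace: dev
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nebari-conda-store-worker
  minReplicas: 1
  maxReplicas: 10                   # illustrative upper bound
  metrics:
  - type: External
    external:
      metric:
        name: queue_messages_ready
        selector:
          matchLabels:
            queue: "worker_tasks"
      target:
        type: AverageValue
        averageValue: "1"           # roughly one queued build per worker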

Option#2 KEDA (Kubernetes-based Event-driven Autoscaling)

Ref:
https://blogs.halodoc.io/autoscaling-k8s-deployments-with-external-metrics/
https://keda.sh/docs/2.13/scalers/
https://keda.sh/docs/2.13/concepts/external-scalers/
https://keda.sh/docs/2.13/scalers/rabbitmq-queue/
https://keda.sh/docs/2.13/scalers/redis-cluster-lists/
https://keda.sh/docs/2.13/scalers/redis-lists/
https://keda.sh/docs/2.13/scalers/postgresql/

The PostgreSQL scaler allows us to run a query against a database, which means we can simply point it at the existing conda-store database to get the queue depth of pending jobs.
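Concretely, the query could be as simple as counting pending builds in conda-store's build table (this is essentially the query used in the PoC further down; the table and status names come from there):

-- Queue depth of pending conda-store builds (as used later in the PoC)
SELECT COUNT(*) FROM build WHERE status = 'QUEUED' OR status = 'BUILDING';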

Pros and Cons

Option#1

  • Components added
    1. Metrics Server
    2. Horizontal Pod Autoscaler
    3. A Queue like RabbitMQ
    4. Custom service to manage the Queue
  • Pros
    1. Lightweight
    2. Components like RabbitMQ and Metrics Server can be re-used if needed.
  • Cons
    1. Requires managing the queue and queue-manager service.
    2. It's a bit of a hacky solution.

Option#2

  • Components added (ref)
    1. Metrics adapter
    2. Controller
    3. Scaler
    4. Admission webhooks
  • Pros
    1. Single purpose but elegant solution based on HPA.
    2. No customization needed; it is full-featured and provides extendable options for scheduling scalers.
    3. The same machinery can be reused to scale other services in the future.
  • Cons
    1. Only the metrics adapter can be re-used by other services.

@pt247 (Contributor) commented Mar 9, 2024

Should this be part of conda-store?

Regardless of the option we take, this can be moved upstream to conda-store.

  • Points in favour of moving this to conda-store
    1. It solves a conda-store problem and touches only conda-store components.
    2. Moving it to conda-store would make it available to all other conda-store deployments.
  • Points against moving this to conda-store
    1. Since conda-store is a core component of Nebari, we can rely on it being there. Therefore we can reuse KEDA to scale other pods as and when needed. This would only become an issue in the highly unlikely event that we decide to move away from conda-store.
    2. We will need to figure out if this is in line with the long-term roadmap of conda-store.

@pt247 (Contributor) commented Mar 9, 2024

We should agree on these before we start. Please suggest. Thanks.

@dcmcand (Contributor, Author) commented Mar 12, 2024

@pt247 Conda store already has a queue, it is using redis and celery. I expect we can pull queue depth from that, so we shouldn't need to deploy extra infra there. The nebari-conda-store-redis-master stateful set is what you are looking for.
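For example, a rough way to check that queue depth directly against the existing Redis instance (the secret name/key, the pod name, and the default Celery queue name "celery" are assumptions; adjust to the actual deployment):

# Assumed secret name/key and pod name for the conda-store Redis instance
REDIS_PASSWORD=$(kubectl get secret nebari-conda-store-redis -n dev \
  -o jsonpath='{.data.redis-password}' | base64 -d)
kubectl exec -n dev nebari-conda-store-redis-master-0 -- \
  redis-cli -a "$REDIS_PASSWORD" LLEN celery   # length of the default Celery queue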

I am unfamiliar with KEDA, but it does look promising and has a redis scaler too. In general I prefer to use built-in solutions as my default, so the horizontal autoscaler was my first thought, but if KEDA allows for better results with less complexity then I can see going with that. KEDA is a CNCF project that seems to be actively maintained, so that is good.
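For comparison with the Postgres route explored further down, a sketch of what the KEDA Redis scaler could look like here (the service address, queue name, and password wiring are assumptions):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-conda-worker-redis          # hypothetical name
  namespace: dev
spec:
  scaleTargetRef:
    kind: Deployment
    name: nebari-conda-store-worker
  triggers:
  - type: redis
    metadata:
      address: nebari-conda-store-redis-master.dev.svc.cluster.local:6379  # assumed service
      listName: celery                     # assumed default Celery queue name
      listLength: "1"
      passwordFromEnv: REDIS_PASSWORD      # password must be available in the target's env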

As to whether this solution belongs in conda-store, I will simply say, it does not. Conda-store allows for horizontal scaling by having a queue with a worker pool. That is where conda-store's responsibility ends. Building specific implementation details for scaling on Nebari into conda-store would cross software boundaries and greatly increase coupling between the projects. That would be moving in the wrong direction. We want to decrease coupling between conda-store and Nebari. conda-store has a method for scaling horizontally, it is on Nebari to implement autoscaling that fits its particular environment.

@Adam-D-Lewis (Member) commented Mar 12, 2024

I bet the conda-store devs would have comments on this, and that it would be implemented in conda-store. It seems like this issue should be transferred to the conda-store repo to improve visibility with the conda-store devs.

@viniciusdc (Contributor) commented Mar 21, 2024

We want to decrease coupling between conda-store and Nebari. conda-store has a method for scaling horizontally, it is on Nebari to implement autoscaling that fits its particular environment.

I also agree that the conda-store already has a sound scaling system; however, we are not using this on our own deployment. Having multiple celery workers is already supported (as both Redis and Celery handle the task load balancing by themselves); we need to discuss how to handle the worker scaling on our Kubernetes infrastructure.

It's a manual process that depends on creating more workers. We need a way to automate this process. I initially suggested using the queue depth on Redis to manage this, which would trigger a CRD to change the number of replicas the worker deployment should have.
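For reference, the manual step that currently has to be repeated, and that this issue aims to automate, is essentially:

# Manually bump the worker replica count; the replica count here is illustrative
kubectl scale deployment nebari-conda-store-worker -n dev --replicas=3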

@dcmcand (Contributor, Author) commented Mar 21, 2024

Either KEDA or the horizontal autoscaler would work here and both can be used to scale automatically using the queue depth. I think that KEDA seems a bit more elegant with its implementation so would suggest using that to start to see if it works and if for some reason it doesn't, then falling back to the horizontal autoscaler.

@pt247 (Contributor) commented Apr 5, 2024

Notes on POC

Installing KEDA:

helm repo add kedacore https://kedacore.github.io/charts
helm install keda kedacore/keda --namespace dev
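After the install, the operator pods and CRDs can be checked with something like:

kubectl get pods -n dev | grep keda    # KEDA operator / metrics apiserver pods
kubectl get crd | grep keda.sh         # scaledobjects.keda.sh etc. should be present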

ScaledObject spec:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-conda-worker
  namespace: dev
spec:
  scaleTargetRef:
    kind:          Deployment                               # Optional. Default: Deployment
    name:          nebari-conda-store-worker  # Mandatory. Must be in the same namespace as the ScaledObject
  triggers:
  - type: postgresql
    metadata:
      query: "SELECT COUNT(*) FROM build WHERE status='BUILDING' OR status='QUEUED';"
      targetQueryValue: "0"
      activationTargetQueryValue: "1"
      host: "nebari-conda-store-postgresql"
      userName: "postgres"
      password: "{nebari-conda-store-postgresql}"
      port: "5432"
      dbName: "conda-store"
      sslmode: disable

I have also tried this:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-conda-worker
  namespace: dev
spec:
  scaleTargetRef:
    kind:          Deployment                               # Optional. Default: Deployment
    name:          nebari-conda-store-worker  # Mandatory. Must be in the same namespace as the ScaledObject
  triggers:
  - type: postgresql
    metadata:
      query: "SELECT COUNT(*) FROM build WHERE status='BUILDING' OR status='QUEUED';"
      targetQueryValue: "0"
      activationTargetQueryValue: "1"
      host: "nebari-conda-store-postgresql.dev.svc.cluster.local"
      passwordFromEnv: PG_PASSWORD
      userName: "postgres"
      port: "5432"
      dbName: "conda-store"
      sslmode: disable

@pt247 (Contributor) commented Apr 5, 2024

I am getting the following error:

2024-04-05T18:44:42Z    ERROR    Reconciler error    {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"scaled-conda-worker","namespace":"dev"}, "namespace": "dev", "name": "scaled-conda-worker", "reconcileID": "17f8e76e-7f9d-4e9e-90e4-77dde8a455d4", "error": "error establishing postgreSQL connection: failed to connect to `host=nebari-conda-store-postgresql.dev.svc.cluster.local user=postgres database=conda-store`: server error (FATAL: password authentication failed for user \"postgres\" (SQLSTATE 28P01))"}
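One way to rule out a wrong password is to read the value KEDA should be using back from the Postgres secret (the secret name and key below match the TriggerAuthentication that eventually worked further down):

# Decode the conda-store Postgres password from its Kubernetes secret
kubectl get secret nebari-conda-store-postgresql -n dev \
  -o jsonpath='{.data.postgresql-password}' | base64 -d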

@viniciusdc (Contributor) commented Apr 5, 2024

Uhm, this is strange behavior; I think something might be missing... I will try to reproduce this on my side as well.

@pt247 (Contributor) commented Apr 7, 2024

I have also tried TriggerAuthentication:

apiVersion: v1
kind: Secret
metadata:
  name: conda-pg-credentials
  namespace: dev
type: Opaque
data:
  PG_PASSWORD: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-conda-secret
  namespace: dev
spec:
  secretTargetRef:
  - parameter: password
    name: conda-pg-credentials
    key: PG_PASSWORD
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-conda-worker
  namespace: dev
spec:
  scaleTargetRef:
    kind:          Deployment                 # Optional. Default: Deployment
    name:          nebari-conda-store-worker  # Mandatory. Must be in the same namespace as the ScaledObject
  triggers:
  - type: postgresql
    metadata:
      query: "SELECT COUNT(*) FROM build WHERE status='BUILDING' OR status='QUEUED';"
      targetQueryValue: "0"
      activationTargetQueryValue: "1"
      host: "nebari-conda-store-postgresql"
      userName: "postgres"
      port: "5432"
      dbName: "conda-store"
      sslmode: disable
    authenticationRef:
      name: keda-trigger-auth-conda-secret

@pt247 (Contributor) commented Apr 7, 2024

This worked:

It turns out that the secret values need to be base64 encoded.

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: trigger-auth-postgres
  namespace: dev
spec:
  secretTargetRef:
  - parameter: password
    name: nebari-conda-store-postgresql
    key: postgresql-password
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-conda-worker
  namespace: dev
spec:
  scaleTargetRef:
    kind: Deployment
    name: nebari-conda-store-worker
  triggers:
  - type: postgresql
    metadata:
      query: "SELECT COUNT(*) FROM build WHERE status='BUILDING' OR status='QUEUED';"
      targetQueryValue: "1"
      host: "nebari-conda-store-postgresql"
      userName: "postgres"
      port: "5432"
      dbName: "conda-store"
      sslmode: disable
    authenticationRef:
      name: trigger-auth-postgres
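With this applied, KEDA creates a backing HPA for the deployment, which can be verified with:

kubectl get scaledobject scaled-conda-worker -n dev   # READY/ACTIVE status of the scaler
kubectl get hpa -n dev                                # KEDA-managed HPA (keda-hpa-scaled-conda-worker)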

pt247 linked a pull request Apr 8, 2024 that will close this issue
@pt247 (Contributor) commented Apr 9, 2024

Performance improvements

We try to create 5 conda environments; to the fifth environment we add scikit-learn.

Current develop branch

Time: 5 minutes 11 seconds
Number of conda-store workers: 1

Default KEDA

Time: 4 minutes 29 seconds
Number of conda-store workers scaled to: 2

With min replica count set to 1 (default is 0)

Time: 2 minutes 35 seconds
Number of conda-store workers scaled to: 2

With min replica count set to 1 (default is 0) + polling interval of 15 seconds (default is 30 seconds)

Time: 4 minutes 14 seconds

With pollingInterval: 5 and min replica count: 1; the trigger query tracks the BUILDING state as well

  minReplicaCount: 1   # Default: 0
  pollingInterval: 5   # Default: 30 seconds
  cooldownPeriod: 60   # Default: 300 seconds

Time: 3 minutes 40 seconds
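For clarity, those tuning options sit at the top level of the ScaledObject spec alongside the trigger; a sketch of the configuration used for the last measurement above (trigger and authentication exactly as in the working manifest):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-conda-worker
  namespace: dev
spec:
  scaleTargetRef:
    kind: Deployment
    name: nebari-conda-store-worker
  minReplicaCount: 1    # default: 0
  pollingInterval: 5    # seconds; default: 30
  cooldownPeriod: 60    # seconds; default: 300
  triggers:
  - type: postgresql
    metadata:
      query: "SELECT COUNT(*) FROM build WHERE status='BUILDING' OR status='QUEUED';"
      targetQueryValue: "1"
      host: "nebari-conda-store-postgresql"
      userName: "postgres"
      port: "5432"
      dbName: "conda-store"
      sslmode: disable
    authenticationRef:
      name: trigger-auth-postgres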
