We also experience the issues mentioned here. I was initially hoping to integrate some redundancy option, so that I could always add x pods to the deployment on top of what the KPA predicts. But I would much prefer predictive scaling, or options for handling cyclical workloads and similar patterns.
As a first step, could I implement this redundancy as a Knative extension and deploy it myself? Are there guides for doing that?
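To make the redundancy idea concrete, here is a minimal sketch of what I have in mind. It is a toy model, not the KPA's actual code: the class `WindowedScaler` and its parameters `window_size`, `target_per_pod`, and `extra_pods` are hypothetical names. It averages observed concurrency over a fixed window, derives a replica count from that, and always adds a fixed number of spare pods.

```python
import math
from collections import deque


class WindowedScaler:
    """Toy moving-average scaler with a fixed redundancy buffer.

    Hypothetical illustration only; this is not the KPA implementation.
    """

    def __init__(self, window_size, target_per_pod, extra_pods=0):
        # Keep only the most recent `window_size` samples.
        self.samples = deque(maxlen=window_size)
        self.target_per_pod = target_per_pod
        self.extra_pods = extra_pods

    def observe(self, total_concurrency):
        """Record one sample of total in-flight requests across all pods."""
        self.samples.append(total_concurrency)

    def desired_replicas(self):
        """Replicas suggested by the window average, plus the spare pods."""
        if not self.samples:
            return self.extra_pods
        average = sum(self.samples) / len(self.samples)
        base = math.ceil(average / self.target_per_pod)
        return base + self.extra_pods


scaler = WindowedScaler(window_size=3, target_per_pod=10, extra_pods=2)
for sample in (5, 25, 30):
    scaler.observe(sample)
print(scaler.desired_replicas())
```

With these inputs the window average is 20, so the base count is 2 and the buffer brings it to 4. The obvious downside is that a constant buffer never reaches zero, which is why a prediction-based floor seems more attractive.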
Describe the feature
The KPA gathers statistics via a moving average across pod replicas over a given time window. I am wondering if we could provide something smarter that also addresses some cold-start issues, e.g., not scaling down to zero if a traffic burst is about to happen.
This could be implemented as a Knative extension, since Knative Services can be updated externally (no need to change the KPA).
There is a lot of history on the topic; see [1] for more. This feature is already offered by cloud providers, for example at the node level; see [2]. See also the related KEDA issue [3]. I am also creating this issue as a reference for future discussions, in case there is interest from the community.
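As a rough sketch of what such an external extension could compute, the function below derives a replica floor from a seasonal-naive forecast: it assumes the next cycle repeats the last one. Everything here is an assumption for illustration (the name `predicted_min_scale`, its parameters, and the forecasting method itself); a real extension would then have to apply the result to the Service, for example through the `autoscaling.knative.dev/min-scale` annotation.

```python
import math


def predicted_min_scale(history, period, horizon, target_per_pod):
    """Return a replica floor based on a seasonal-naive forecast.

    history: concurrency samples, oldest first, one per time step.
    period: number of steps in one cycle (e.g. 24 for hourly samples
        with a daily pattern).
    horizon: number of upcoming steps to look ahead.
    target_per_pod: concurrency one pod is expected to handle.

    Hypothetical helper; not part of Knative.
    """
    n = len(history)
    if n < period:
        # Not enough data for one full cycle: impose no floor.
        return 0
    # Seasonal naive: the value forecast for step n + h is the value
    # observed one period earlier, at step n + h - period.
    steps = min(horizon, period)
    upcoming = [history[n - period + h] for h in range(steps)]
    return math.ceil(max(upcoming) / target_per_pod)


# Two cycles of four steps, with a burst on the third step of each cycle.
history = [0, 0, 50, 0, 0, 0, 50, 0]
print(predicted_min_scale(history, period=4, horizon=3, target_per_pod=10))
print(predicted_min_scale(history, period=4, horizon=2, target_per_pod=10))
```

With a three-step horizon the burst is in view and the floor is 5; with a two-step horizon it is not, and the floor stays at 0, so scale to zero would still be allowed. The horizon would need to be at least as long as the cold-start time for this to help.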
Refs
[1] Lucia Schuler, Somaya Jamil, Niklas Kühl, AI-based Resource Allocation: Reinforcement Learning for Adaptive Auto-scaling in Serverless Environments.
[2] Predictive scaling for Amazon EC2 Auto Scaling
[3] kedacore/keda#2401
cc @dprotaso @ReToCode