Different node selection methods for termination #105

Open
awprice opened this issue May 17, 2018 · 0 comments
Labels
enhancement New feature or request feature

awprice commented May 17, 2018

At the moment Escalator supports only one mode for selecting which nodes to terminate: oldest first. This mode prioritises the oldest nodes, ordered by their creation timestamp in the Kubernetes API. It is simple and works well, but additional modes may be needed to support service-based workloads.

This issue proposes some new node selection methods for termination, which are:

  • Selection of nodes based on how easily the node can be drained. This would be determined with the drain simulation package provided by the cluster-autoscaler tool.
  • Selection of nodes based on how utilised they are. This would prioritise nodes with fewer requested resources, so that nodes that are idle or nearly idle are terminated first.
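
The utilisation-based method could look something like the sketch below. All names here (`NodeUsage`, `Utilisation`, `LeastUtilisedFirst`) are hypothetical, and taking the larger of the CPU and memory request fractions as the node's utilisation is an assumption made for illustration; the proposal does not specify how utilisation would be computed. The drain-based method is not sketched, since it depends on the cluster-autoscaler simulator package.

```go
package main

import (
	"fmt"
	"sort"
)

// NodeUsage is a hypothetical summary of what is requested on a node
// versus what the node can allocate.
type NodeUsage struct {
	Name                string
	AllocatableMilliCPU int64
	RequestedMilliCPU   int64
	AllocatableMemBytes int64
	RequestedMemBytes   int64
}

// fraction returns requested/allocatable, or 0 when allocatable is not positive.
func fraction(requested, allocatable int64) float64 {
	if allocatable <= 0 {
		return 0
	}
	return float64(requested) / float64(allocatable)
}

// Utilisation returns the larger of the CPU and memory request fractions.
// Using the dominant resource is an assumption of this sketch.
func (n NodeUsage) Utilisation() float64 {
	cpu := fraction(n.RequestedMilliCPU, n.AllocatableMilliCPU)
	mem := fraction(n.RequestedMemBytes, n.AllocatableMemBytes)
	if cpu > mem {
		return cpu
	}
	return mem
}

// LeastUtilisedFirst returns a copy of nodes ordered by ascending utilisation,
// so nodes closest to idle come first.
func LeastUtilisedFirst(nodes []NodeUsage) []NodeUsage {
	sorted := make([]NodeUsage, len(nodes))
	copy(sorted, nodes)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Utilisation() < sorted[j].Utilisation()
	})
	return sorted
}

func main() {
	nodes := []NodeUsage{
		{Name: "busy", AllocatableMilliCPU: 4000, RequestedMilliCPU: 3000, AllocatableMemBytes: 8000, RequestedMemBytes: 1000},
		{Name: "idle", AllocatableMilliCPU: 4000, RequestedMilliCPU: 200, AllocatableMemBytes: 8000, RequestedMemBytes: 1000},
	}
	for _, n := range LeastUtilisedFirst(nodes) {
		fmt.Printf("%s %.3f\n", n.Name, n.Utilisation())
	}
}
```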

These node selection methods could potentially be combined, using a weighted sum model to determine the "ideal" (highest scoring) nodes to terminate first. The weighted sum model would assign each node a score by evaluating it against a set of criteria: how old the node is, how easily it can be drained, and how utilised it is. The nodes with the highest overall scores would be prioritised for termination.
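
A weighted sum model along these lines might be sketched as follows. The types, field names and weights are all hypothetical, and the sketch assumes each criterion has already been normalised to the range 0 to 1, with higher meaning "better to terminate" (so the utilisation criterion is expressed as idleness).

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate holds hypothetical per-criterion scores for one node.
// Each score is assumed to be normalised to [0, 1], where higher
// means the node is a better candidate for termination.
type Candidate struct {
	Name       string
	AgeScore   float64 // 1.0 = oldest node in the group
	DrainScore float64 // 1.0 = trivially drainable
	IdleScore  float64 // 1.0 = nothing requested on the node
}

// Weights are the relative importance of each criterion (illustrative values only).
type Weights struct {
	Age, Drain, Idle float64
}

// Score computes the weighted sum of the candidate's criterion scores.
func (w Weights) Score(c Candidate) float64 {
	return w.Age*c.AgeScore + w.Drain*c.DrainScore + w.Idle*c.IdleScore
}

// RankForTermination returns a copy of cs ordered by descending score,
// so the highest scoring nodes come first.
func RankForTermination(cs []Candidate, w Weights) []Candidate {
	ranked := make([]Candidate, len(cs))
	copy(ranked, cs)
	sort.SliceStable(ranked, func(i, j int) bool {
		return w.Score(ranked[i]) > w.Score(ranked[j])
	})
	return ranked
}

func main() {
	w := Weights{Age: 0.5, Drain: 0.2, Idle: 0.3}
	candidates := []Candidate{
		{Name: "new-idle", AgeScore: 0.1, DrainScore: 0.9, IdleScore: 0.9},
		{Name: "old-busy", AgeScore: 1.0, DrainScore: 0.5, IdleScore: 0.1},
		{Name: "mid-mixed", AgeScore: 0.5, DrainScore: 0.4, IdleScore: 0.5},
	}
	for _, c := range RankForTermination(candidates, w) {
		fmt.Printf("%s %.2f\n", c.Name, w.Score(c))
	}
}
```

With these illustrative weights, the old but busy node still ranks ahead of the newer idle one, because age carries the largest weight. Different weights would shift the balance towards drainability or idleness.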

Using the utilisation-based termination method by itself may lead to a situation where some nodes are never terminated because they are heavily utilised. By using a weighted sum model and pairing it with the current "oldest first" method, both utilisation and node age would be considered before deciding which nodes to terminate.

Cluster autoscaler drain simulator: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/simulator
Weighted sum model: https://en.wikipedia.org/wiki/Weighted_sum_model

/cc @dadux @mwhittington21
