Different node selection methods for termination #105

awprice · 2018-05-17T10:44:34Z

At the moment, Escalator supports only one mode for selecting which nodes to terminate: oldest first. This mode prioritises the oldest nodes, ordered by their creation timestamp in the Kubernetes API. It is simple and works well, but more modes may be needed to support service-based workloads.

This issue proposes some new node selection methods for termination, which are:

Selection of nodes based on how easily drainable the node is. This would be determined with the drain simulation package provided by the cluster-autoscaler tool.
Selection of nodes based on how utilised they are. This would be determined by prioritising nodes with fewer requested resources, so that nodes that are close to idle or have low usage are terminated first.
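
As a rough illustration of the utilisation-based method, the sketch below derives a single utilisation figure from requested versus allocatable resources. The `resources` type, the function names, and the choice to take the larger of the CPU and memory fractions are all assumptions made for this example, not existing Escalator code.

```go
package main

import "fmt"

// resources is a hypothetical summary of CPU (millicores) and memory (bytes).
// It is not an Escalator or Kubernetes type.
type resources struct {
	MilliCPU    int64
	MemoryBytes int64
}

// fraction returns req/alloc, or 0 when nothing is allocatable.
func fraction(req, alloc int64) float64 {
	if alloc <= 0 {
		return 0
	}
	return float64(req) / float64(alloc)
}

// utilisation returns the larger of the CPU and memory request fractions.
// Using the maximum is an assumption for this sketch; an average or a
// per-resource score would be equally valid choices.
func utilisation(requested, allocatable resources) float64 {
	cpu := fraction(requested.MilliCPU, allocatable.MilliCPU)
	mem := fraction(requested.MemoryBytes, allocatable.MemoryBytes)
	if cpu > mem {
		return cpu
	}
	return mem
}

func main() {
	requested := resources{MilliCPU: 500, MemoryBytes: 1 << 30}
	allocatable := resources{MilliCPU: 2000, MemoryBytes: 8 << 30}
	// A lower value means the node is closer to idle and a better
	// candidate for termination under this method.
	fmt.Println(utilisation(requested, allocatable))
}
```

Nodes would then be ordered by ascending utilisation, with the least utilised node terminated first.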

These node selection methods could potentially be used at the same time, with a weighted sum model used to determine the "ideal" or highest scoring nodes to terminate first. The weighted sum model would apply a score to each node when evaluating it against a set of criteria. The criteria could be how old the node is, how easily it can be drained, and how utilised it is. The nodes with the highest overall scores would be prioritised for termination.
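
A minimal sketch of how such a weighted sum could be computed is shown below. Everything in it is hypothetical: the `nodeInfo` and `weights` types, the normalisation of each criterion to the range 0 to 1, and the assumption that drainability and utilisation have already been computed elsewhere. Age is normalised against the oldest node, and utilisation is inverted so that idle nodes score higher.

```go
package main

import (
	"fmt"
	"sort"
)

// nodeInfo is a hypothetical per-node summary, not an Escalator type.
type nodeInfo struct {
	Name        string
	AgeHours    float64 // time since the node was created
	Drainable   bool    // assumed result of a drain simulation
	Utilisation float64 // requested / allocatable, in [0, 1]
}

// weights holds hypothetical tuning knobs, one per criterion.
type weights struct {
	Age, Drain, Idle float64
}

// score returns the weighted sum of the normalised criteria.
// A higher score means the node should be terminated sooner.
func score(n nodeInfo, maxAge float64, w weights) float64 {
	ageScore := 0.0
	if maxAge > 0 {
		ageScore = n.AgeHours / maxAge
	}
	drainScore := 0.0
	if n.Drainable {
		drainScore = 1.0
	}
	idleScore := 1.0 - n.Utilisation
	return w.Age*ageScore + w.Drain*drainScore + w.Idle*idleScore
}

// rankForTermination returns a copy of nodes sorted by descending score.
func rankForTermination(nodes []nodeInfo, w weights) []nodeInfo {
	maxAge := 0.0
	for _, n := range nodes {
		if n.AgeHours > maxAge {
			maxAge = n.AgeHours
		}
	}
	ranked := append([]nodeInfo(nil), nodes...)
	sort.SliceStable(ranked, func(i, j int) bool {
		return score(ranked[i], maxAge, w) > score(ranked[j], maxAge, w)
	})
	return ranked
}

func main() {
	nodes := []nodeInfo{
		{Name: "node-a", AgeHours: 72, Drainable: false, Utilisation: 0.9},
		{Name: "node-b", AgeHours: 24, Drainable: true, Utilisation: 0.1},
		{Name: "node-c", AgeHours: 48, Drainable: true, Utilisation: 0.5},
	}
	for _, n := range rankForTermination(nodes, weights{Age: 1, Drain: 1, Idle: 1}) {
		fmt.Println(n.Name)
	}
}
```

Setting only the `Age` weight to a non-zero value reduces this sketch to the existing "oldest first" ordering, so the current behaviour could remain the default.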

Using the utilisation-based termination method by itself may lead to a situation where some nodes are never terminated because they are heavily utilised. With a weighted sum model that also includes the current "oldest first" method, both utilisation and node age would be considered before deciding which nodes to terminate.

Cluster autoscaler drain simulator: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/simulator
Weighted sum model: https://en.wikipedia.org/wiki/Weighted_sum_model

/cc @dadux @mwhittington21

awprice added the enhancement and feature labels May 17, 2018

awprice added this to Pending in Escalator via automation May 17, 2018

awprice mentioned this issue May 17, 2018

Add option to perform a drain before terminating a node #93

Open

akshayks mentioned this issue Oct 18, 2019

Default node termination policy is inappropriate for StatefulSets #177

Open
