[proposal] deschedule production pods between nodes #2043
Comments
Sounds reasonable. Are you interested in participating in the development?
Great idea! Some details may need discussion:
Maybe we don't need to balance the load between prod and batch workloads within a single node. Instead, the ability to balance Prod workloads across all nodes seems more valuable (to avoid hotspot nodes with too many Prod workloads).
Hope to hear more about your ideas, and you are welcome to participate in the development!
I can take it after discussing the final plan. @songtao98
Based on the discussion we have had:
With this implementation, we only add one piece of additional logic: evict a Prod pod when the node's total load is under TotalResourceThreshold but its Prod pod load is beyond ProdResourceThreshold. If a node exceeds its total load threshold and the Prod pod load on it exceeds the Prod pod threshold at the same time, pods are evicted by the existing logic, i.e., pods with lower Priority are evicted first. /assign @zwForrest
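The decision rule described above can be sketched in Go. This is an illustration of the two-threshold logic only, not Koordinator's actual code: the type `NodeUsage`, the function `shouldRebalance`, and the field names are all hypothetical.

```go
package main

import "fmt"

// NodeUsage holds hypothetical utilization percentages for one node.
type NodeUsage struct {
	TotalPercent float64 // load from all pods on the node
	ProdPercent  float64 // load from Prod-priority pods only
}

// shouldRebalance sketches the proposed eviction decision:
//   - if total load exceeds totalThreshold, fall back to the existing
//     logic (evict lower-priority pods first);
//   - otherwise, if Prod-only load exceeds prodThreshold, evict a Prod
//     pod to rebalance Prod workloads across nodes.
func shouldRebalance(u NodeUsage, totalThreshold, prodThreshold float64) (evictProd bool, useExistingLogic bool) {
	if u.TotalPercent >= totalThreshold {
		// Node is a hotspot overall: existing priority-based eviction applies.
		return false, true
	}
	// Node is fine overall, but Prod pods alone may exceed their threshold.
	return u.ProdPercent > prodThreshold, false
}

func main() {
	// Total load 55% < 70%, but Prod load 48% > 40%: new logic evicts a Prod pod.
	evict, existing := shouldRebalance(NodeUsage{TotalPercent: 55, ProdPercent: 48}, 70, 40)
	fmt.Println(evict, existing)
}
```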
What is your proposal:
Even when the total loads of two nodes are almost the same, the split between production and batch workloads on each node can differ. We expect production and batch workloads to be balanced between nodes. Currently, the loadaware scheduling plugin can evaluate pod loads separately for Production and Batch pods, but the descheduler does not have the ability to rebalance based on production workloads. With this capability, nodes could avoid hotspots caused by production workload load.
Why is this needed:
Reduce hotspots caused by production workload load.
Is there a suggested solution, if so, please add it:
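One possible configuration surface for the proposal, sketched as a descheduler profile: the `highThresholds` field mirrors the existing total-load thresholds of Koordinator's LowNodeLoad balance plugin, while `prodHighThresholds` is a hypothetical field invented here to illustrate the proposed Prod-only threshold; exact API group, version, and field names would need to follow the final design.

```yaml
apiVersion: descheduler/v1alpha2
kind: DeschedulerConfiguration
profiles:
  - name: koord-descheduler
    plugins:
      balance:
        enabled:
          - name: LowNodeLoad
    pluginConfig:
      - name: LowNodeLoad
        args:
          # Existing total-load thresholds (percent of node allocatable).
          highThresholds:
            cpu: 70
            memory: 80
          # Hypothetical Prod-only thresholds proposed in this issue;
          # a node under highThresholds but over prodHighThresholds
          # would have a Prod pod evicted for rebalancing.
          prodHighThresholds:
            cpu: 40
            memory: 50
```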