Add option to adjust job OOM score #363
base: main
Conversation
// used in Kubernetes.
//
// TODO(khlebnikov): Use memory limit dedicated for user jobs cgroup.
return MaxOOMScore - (MaxOOMScore * reservedMemory + totalMemory - 1) / totalMemory;
Can't this TODO be easily fixed by taking the UserJobs limit from the memory usage tracker?
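For a sense of scale (illustrative numbers only, and assuming MaxOOMScore is 1000, matching the kernel's oom_score_adj range): a job reserving 4 GiB evaluated against a 256 GiB node total gets 1000 - ceil(1000 * 4 / 256) = 984, while the same job evaluated against a 64 GiB user-jobs cgroup limit gets 1000 - ceil(1000 * 4 / 64) = 937, so the choice of denominator noticeably changes the adjustment.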
@@ -987,6 +987,24 @@ bool TJob::ResourceUsageOverdrafted() const
    return TResourceHolder::GetResourceUsage().UserMemory > RequestedMemory_;
}

i64 TJob::GetOOMScoreAdjustment() const
{
    auto totalMemory = Bootstrap_->GetMemoryUsageTracker()->GetTotalLimit();
You should use JobResourceManager->GetResourceLimits() here.
I don't think so. This must be the common cgroup memory limit for all jobs.
There is no such thing yet.
And more likely this should be part of NodeResourceManager.
Yes, this must be the common cgroup memory limit for all jobs. And this limit must be returned by JobResourceManager->GetResourceLimits().
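Putting the two diff fragments together, a rough sketch of the direction this thread converges on might look like the following. Only GetResourceLimits() itself is named in the review; the accessor path to the job resource manager, the UserMemory field on the returned limits, and the use of RequestedMemory_ as the reservation are assumptions for illustration.

// Sketch only: compute the job's oom_score_adj against the memory limit
// shared by all jobs rather than the node-wide total.
i64 TJob::GetOOMScoreAdjustment() const
{
    // Assumption: the common job memory limit is exposed via
    // JobResourceManager->GetResourceLimits(), as suggested above.
    auto totalMemory = Bootstrap_->GetJobResourceManager()->GetResourceLimits().UserMemory;
    // Assumption: the job's reservation is its requested memory.
    auto reservedMemory = RequestedMemory_;
    // Same formula as in the patch: shift the score up and subtract the
    // job's reserved share of the limit, rounded up.
    return MaxOOMScore - (MaxOOMScore * reservedMemory + totalMemory - 1) / totalMemory;
}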
// used in Kubernetes.
//
// TODO(khlebnikov): Use memory limit dedicated for user jobs cgroup.
return MaxOOMScore - (MaxOOMScore * reservedMemory + totalMemory - 1) / totalMemory;
I'm not sure this is a suitable formula.
Why does the OOMScoreAdjustment decrease as memory demand increases?
This formula shifts the score up but removes the penalty for usage below the reservation.
The main effect is that a task whose usage is above its reserve gets a bigger score than a task whose usage is below its reserve, regardless of the sizes of those reserves.
This is all a better-than-nothing estimate for containers with a single big process.
Kubernetes uses the same logic: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/qos/policy.go
The actual OOM badness (the non-normalized score) in the kernel:
https://github.com/torvalds/linux/blob/master/mm/oom_kill.c#L201
Roughly: task oom score ~= task memory usage / total memory size + oom score adj
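To make that concrete, here is a small self-contained sketch. The numbers are made up, MaxOOMScore = 1000 is assumed, and the badness model is the simplified "usage share plus adjustment" approximation from the links above, not the exact kernel computation.

#include <cstdint>
#include <iostream>

constexpr int64_t MaxOOMScore = 1000;
constexpr int64_t GiB = 1LL << 30;

// The adjustment formula from the patch: shift the score up and subtract the
// job's reserved share of total memory, rounded up.
int64_t Adjustment(int64_t reserved, int64_t total)
{
    return MaxOOMScore - (MaxOOMScore * reserved + total - 1) / total;
}

// Simplified badness model: usage share (scaled to 0..1000) plus the adjustment.
int64_t ApproxBadness(int64_t usage, int64_t reserved, int64_t total)
{
    return usage * 1000 / total + Adjustment(reserved, total);
}

int main()
{
    const int64_t total = 64 * GiB; // memory available to all jobs (made-up number)

    // Job A reserves 8 GiB but uses 16 GiB (above its reservation): 250 + 875 = 1125.
    std::cout << "A: " << ApproxBadness(16 * GiB, 8 * GiB, total) << '\n';
    // Job B reserves 32 GiB and uses only 24 GiB (below its reservation): 375 + 500 = 875.
    std::cout << "B: " << ApproxBadness(24 * GiB, 32 * GiB, total) << '\n';
    // A scores higher than B even though B uses more memory in absolute terms,
    // because A exceeded its reservation and B did not.
}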