
Add option to adjust job OOM score #363

Draft
koct9i wants to merge 1 commit into main

Conversation

@koct9i (Collaborator) commented Jan 31, 2024

No description provided.

    // used in Kubernetes.
    //
    // TODO(khlebnikov): Use memory limit dedicated for user jobs cgroup.
    return MaxOOMScore - (MaxOOMScore * reservedMemory + totalMemory - 1) / totalMemory;
Member commented:

Can't this TODO be easily fixed by taking the UserJobs limit from the memory usage tracker?

@@ -987,6 +987,24 @@ bool TJob::ResourceUsageOverdrafted() const
     return TResourceHolder::GetResourceUsage().UserMemory > RequestedMemory_;
 }
 
+i64 TJob::GetOOMScoreAdjustment() const
+{
+    auto totalMemory = Bootstrap_->GetMemoryUsageTracker()->GetTotalLimit();
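
For orientation, here is a hedged sketch of how the two fragments visible in this review could fit together. The hunk is truncated on the page; MaxOOMScore's value and the source of reservedMemory are assumptions here, not taken from the patch.

    constexpr i64 MaxOOMScore = 1000; // Assumed: mirrors the Linux oom_score_adj upper bound.

    i64 TJob::GetOOMScoreAdjustment() const
    {
        auto totalMemory = Bootstrap_->GetMemoryUsageTracker()->GetTotalLimit();
        // Assumed: the excerpt does not show where the reserve comes from;
        // RequestedMemory_ (seen in ResourceUsageOverdrafted above) is one candidate.
        auto reservedMemory = RequestedMemory_;
        // TODO(khlebnikov): Use memory limit dedicated for user jobs cgroup.
        // Ceiling division: any nonzero reserve lowers the score by at least 1.
        return MaxOOMScore - (MaxOOMScore * reservedMemory + totalMemory - 1) / totalMemory;
    }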
Collaborator commented:

You should use JobResourceManager->GetResourceLimits() here.

Collaborator (Author) commented:

I don't think so. This must be the common cgroup memory limit for all jobs. There is no such thing yet.

More likely, this should be part of NodeResourceManager.

Collaborator commented:

Yes, this must be the common cgroup memory limit for all jobs. And this limit must be returned by JobResourceManager->GetResourceLimits().
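
As a minimal sketch of that suggestion (the accessor path and the UserMemory field are assumptions, not taken from the patch), the first line of the function would become:

    // Sketch only: common limit for all jobs instead of the node-wide total.
    auto totalMemory = Bootstrap_->GetJobResourceManager()->GetResourceLimits().UserMemory;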

    // used in Kubernetes.
    //
    // TODO(khlebnikov): Use memory limit dedicated for user jobs cgroup.
    return MaxOOMScore - (MaxOOMScore * reservedMemory + totalMemory - 1) / totalMemory;
Collaborator commented:

I'm not sure this is a suitable formula. Why does OOMScoreAdjustment decrease as memory demand increases?

Collaborator (Author) commented:

This formula shifts the score up but removes the penalty for usage below the reserve.

The main effect: a task whose usage is above its reserve gets a bigger score than a task whose usage is below its reserve, regardless of the sizes of those reserves.

This is all a better-than-nothing estimation for containers with a single big process.

The same logic is used in k8s: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/qos/policy.go

The actual oom badness (non-normalized score) in the kernel:
https://github.com/torvalds/linux/blob/master/mm/oom_kill.c#L201

task oom score ≈ task memory usage / total memory size + oom_score_adj
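
To make the effect concrete, here is a small self-contained sketch of the arithmetic; MaxOOMScore = 1000 (the Linux oom_score_adj upper bound) and the node/reserve sizes are illustrative assumptions:

    #include <cstdint>
    #include <iostream>

    constexpr int64_t MaxOOMScore = 1000; // Assumed: Linux oom_score_adj upper bound.

    // The formula from the patch, with the reserve's share of MaxOOMScore rounded up.
    int64_t GetOOMScoreAdjustment(int64_t reservedMemory, int64_t totalMemory)
    {
        return MaxOOMScore - (MaxOOMScore * reservedMemory + totalMemory - 1) / totalMemory;
    }

    int main()
    {
        constexpr int64_t GiB = 1LL << 30;
        constexpr int64_t total = 64 * GiB; // Illustrative node size.
        // A bigger reserve yields a smaller adjustment, so a task that exceeds a
        // small reserve out-scores a task sitting safely under a large one.
        std::cout << GetOOMScoreAdjustment(4 * GiB, total) << "\n";  // 1000 - 63  = 937
        std::cout << GetOOMScoreAdjustment(32 * GiB, total) << "\n"; // 1000 - 500 = 500
    }

Combined with the kernel badness above (roughly usage / total, scaled to 1000 points), the adjustment cancels the reserve's share: the total score starts rising past the baseline only once usage exceeds the reserve, which is why the adjustment decreases as the declared memory demand grows.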


koct9i marked this pull request as draft on March 5, 2024.