Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix Data Race leading to panic in Scheduler (and use more efficient strings comparison) #3325

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

belo4ya
Copy link

@belo4ya belo4ya commented Feb 22, 2024

3e03d16 - fix #3324 (scheduler panic in unexpected places due to Data Race):
reading operations during the modification of the var IgnoredDevicesList []string could lead to a panic:

WARNING: DATA RACE
Read at 0x00000565d810 by goroutine 276:
  volcano.sh/volcano/pkg/scheduler/api.NewResource()
      /go/src/volcano.sh/volcano/pkg/scheduler/api/resource_info.go:89 +0x40a
  volcano.sh/volcano/pkg/scheduler/api.(*JobInfo).GetMinResources()
      /go/src/volcano.sh/volcano/pkg/scheduler/api/job_info.go:490 +0x1705
  volcano.sh/volcano/pkg/scheduler/plugins/proportion.(*proportionPlugin).OnSessionOpen.func5()
      /go/src/volcano.sh/volcano/pkg/scheduler/plugins/proportion/proportion.go:352 +0x16a1
  volcano.sh/volcano/pkg/scheduler/framework.(*Session).JobEnqueueable()
      /go/src/volcano.sh/volcano/pkg/scheduler/framework/session_plugins.go:401 +0x24a
  volcano.sh/volcano/pkg/scheduler/actions/enqueue.(*Action).Execute()
      /go/src/volcano.sh/volcano/pkg/scheduler/actions/enqueue/enqueue.go:95 +0xddd
  volcano.sh/volcano/pkg/scheduler.(*Scheduler).runOnce()
      /go/src/volcano.sh/volcano/pkg/scheduler/scheduler.go:126 +0x478
  volcano.sh/volcano/pkg/scheduler.(*Scheduler).runOnce-fm()
      <autogenerated>:1 +0x39
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1()
      /go/pkg/mod/k8s.io/apimachinery@v0.27.2/pkg/util/wait/backoff.go:226 +0x48
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil()
      /go/pkg/mod/k8s.io/apimachinery@v0.27.2/pkg/util/wait/backoff.go:227 +0xce
  k8s.io/apimachinery/pkg/util/wait.JitterUntil()
      /go/pkg/mod/k8s.io/apimachinery@v0.27.2/pkg/util/wait/backoff.go:204 +0x10d
  k8s.io/apimachinery/pkg/util/wait.Until()
      /go/pkg/mod/k8s.io/apimachinery@v0.27.2/pkg/util/wait/backoff.go:161 +0x48
  volcano.sh/volcano/pkg/scheduler.(*Scheduler).Run.func2()
      /go/src/volcano.sh/volcano/pkg/scheduler/scheduler.go:95 +0x58

457d37b - a little more efficient string comparison:
taking into account the source code of strings.Compare:

// Compare returns an integer comparing two strings lexicographically.
// The result will be 0 if a == b, -1 if a < b, and +1 if a > b.
//
// Compare is included only for symmetry with package bytes.
// It is usually clearer and always faster to use the built-in
// string comparison operators ==, <, >, and so on.
func Compare(a, b string) int {
	// NOTE(rsc): This function does NOT call the runtime cmpstring function,
	// because we do not want to provide any performance justification for
	// using strings.Compare. Basically no one should use strings.Compare.
	// As the comment above says, it is here only for symmetry with package bytes.
	// If performance is important, the compiler should be changed to recognize
	// the pattern so that all code doing three-way comparisons, not just code
	// using strings.Compare, can benefit.
	if a == b {
		return 0
	}
	if a < b {
		return -1
	}
	return +1
}

@volcano-sh-bot
Copy link
Contributor

Welcome @belo4ya!

It looks like this is your first PR to volcano-sh/volcano 馃帀.

Thank you, and welcome to Volcano. 😃

@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign k82cn
You can assign the PR to them by writing /assign @k82cn in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 22, 2024
@belo4ya
Copy link
Author

belo4ya commented Feb 23, 2024

/assign @k82cn

var RegisteredDevices = []string{
GPUSharingDevice, vgpu.DeviceName,
}

var IgnoredDevicesList = ignoredDevicesList{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IgnoredDevicesList is a global var and seems no place will remove items from it, is this correct?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't fully understand your question.

The only place where the IgnoredDevicesList is currently being overwritten (cleared and filled with new values, currently always constant) is here.

If you mean that using a global variable is not justified, then I may agree with you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't notice your Set method has reinitialized the variable.

@Monokaix
Copy link
Member

How about set IgnoredDevicesList in nodeInfo?

@Monokaix
Copy link
Member

How about set IgnoredDevicesList in nodeInfo?

Hi, any progress here?

@Vacant2333
Copy link
Contributor

any update? i can help u finish this pr, @belo4ya

@belo4ya
Copy link
Author

belo4ya commented Apr 2, 2024

any update? i can help u finish this pr, @belo4ya

I glanced at the idea of linking IgnoredDevicesList with NodeInfo - and it turned out that there is already such a relationship:

ni.Others[GPUSharingDevice].(Devices).GetIgnoredDevices(),
ni.Others[vgpu.DeviceName].(Devices).GetIgnoredDevices(),

Then, for each filtering operation of ignored resources, an IgnoredDevicesList could be calculated locally:

        var niList []NodeInfo
	ignoredDevicesList := map[string]struct{}{}
	for _, ni := range niList {
		for _, device := range ni.Others[GPUSharingDevice].(Devices).GetIgnoredDevices() {
			ignoredDevicesList[device] = struct{}{}
		}
		for _, device := range ni.Others[vgpu.DeviceName].(Devices).GetIgnoredDevices() {
			ignoredDevicesList[device] = struct{}{}
		}
	}

But I have a sharp contradiction when I go further. I don't fully understand the current work with IgnoredDevicesList - it doesn't seem right to me.

And at the same time, it seems to me that NodeInfo and IgnoredDevicesList are different levels of abstraction. IgnoredDevicesList is calculated at the time of NodeInfo initialization, and the further (and previous) initialization of NodeInfo depends on IgnoredDevicesList. And IgnoredDevicesList is an aggregate that is useless before initializing all NodeInfo.

That is, I mean that IgnoredDevicesList is not an object that should exist inside NodeInfo as an attribute. And this is an object that should be at the level with nodes []*v1.Node

It seems that such a change will require major edits and I would like to know your opinion, how do you see how this should be implemented? @Monokaix, @Vacant2333

@Monokaix
Copy link
Member

Monokaix commented May 8, 2024

any update? i can help u finish this pr, @belo4ya

I glanced at the idea of linking IgnoredDevicesList with NodeInfo - and it turned out that there is already such a relationship:

ni.Others[GPUSharingDevice].(Devices).GetIgnoredDevices(),
ni.Others[vgpu.DeviceName].(Devices).GetIgnoredDevices(),

Then, for each filtering operation of ignored resources, an IgnoredDevicesList could be calculated locally:

        var niList []NodeInfo
	ignoredDevicesList := map[string]struct{}{}
	for _, ni := range niList {
		for _, device := range ni.Others[GPUSharingDevice].(Devices).GetIgnoredDevices() {
			ignoredDevicesList[device] = struct{}{}
		}
		for _, device := range ni.Others[vgpu.DeviceName].(Devices).GetIgnoredDevices() {
			ignoredDevicesList[device] = struct{}{}
		}
	}

But I have a sharp contradiction when I go further. I don't fully understand the current work with IgnoredDevicesList - it doesn't seem right to me.

And at the same time, it seems to me that NodeInfo and IgnoredDevicesList are different levels of abstraction. IgnoredDevicesList is calculated at the time of NodeInfo initialization, and the further (and previous) initialization of NodeInfo depends on IgnoredDevicesList. And IgnoredDevicesList is an aggregate that is useless before initializing all NodeInfo.

That is, I mean that IgnoredDevicesList is not an object that should exist inside NodeInfo as an attribute. And this is an object that should be at the level with nodes []*v1.Node

It seems that such a change will require major edits and I would like to know your opinion, how do you see how this should be implemented? @Monokaix, @Vacant2333

Yeah, we can keep this: )

Signed-off-by: belo4ya <41exey.kov41ev@gmail.com>
Signed-off-by: belo4ya <41exey.kov41ev@gmail.com>
@william-wang william-wang force-pushed the fix-data-race-for-shared-devices branch from 457d37b to 88fbcfc Compare May 8, 2024 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scheduler panic in unexpected places and forced termination of Jobs due to Data Race
5 participants