Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Umbrella] Implement QueueingHintFn in in-tree plugins #118893

Open
8 of 12 tasks
Tracked by #122597
sanposhiho opened this issue Jun 27, 2023 · 79 comments
Open
8 of 12 tasks
Tracked by #122597

[Umbrella] Implement QueueingHintFn in in-tree plugins #118893

sanposhiho opened this issue Jun 27, 2023 · 79 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@sanposhiho
Copy link
Member

sanposhiho commented Jun 27, 2023

What would you like to be added?

In #118551, we implemented QueueingHintFn in EventsToRegister which allows each plugin to filter out useless events to requeue Pod back to the activeQ. (See the PR or the original proposal about the details)
Basically, all in-tree plugins, which implements PreFilter or Filter, should have QueueingHintFn to achieve the efficient requeueing.

Please feel free to assign what you'd like to work on. And please comment in this issue before starting to work on so that we can avoid many people from working on the same thing!

/sig scheduling
/triage accepted
/priority important-longterm

Why is this needed?

It will contribute to the enqueue efficiency a lot and thus overall scheduling performance should get improved significantly by preventing the useless retry on the scheduling as much as possible.

@sanposhiho sanposhiho added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 27, 2023
@k8s-ci-robot k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Jun 27, 2023
@sanposhiho
Copy link
Member Author

@pohly I assigned DRA to you 🙇 Could you raise a PR for DRA like #117844 implements with adding some unit tests? That'll be blocked by #118529 eventually though.

@carlory
Copy link
Member

carlory commented Jun 27, 2023

Hi @sanposhiho , I would like to work on this issue. I'll pick up the NodeAffinity, NodePorts and NodeResourceFit.

@carlory
Copy link
Member

carlory commented Jun 27, 2023

/assign

@wackxu
Copy link
Contributor

wackxu commented Jun 28, 2023

I'd like to work on NodeUnschedulable,TaintToleration
/assign

@nayihz
Copy link
Contributor

nayihz commented Jun 28, 2023

I'll pick up the TopologySpread
/assign

@sanposhiho
Copy link
Member Author

sanposhiho commented Jun 28, 2023

@czybjtu @wackxu @carlory
Thanks, I assigned them to you all. Please ping me in your PRs so that I can give some looks.
Also, feel free to reach out me if you have any questions around (either here or the k8s slack), I know it may be kind of difficult to fully understand the concept because there is actually no place yet using QueueingHintFn.

@utam0k
Copy link
Member

utam0k commented Jun 28, 2023

I am interested in it. assign me.

@sanposhiho
Copy link
Member Author

sanposhiho commented Jun 28, 2023

/assign @utam0k

Which one would you like to pick up?

@utam0k
Copy link
Member

utam0k commented Jun 28, 2023

/assign @utam0k

Which one would you like to pick up?

Anything, but I'm happy if it's a plugin I haven't touched other issues.

@sanposhiho
Copy link
Member Author

I assigned CSILimits/nonCSILimits to you.

@y-taka-23
Copy link
Contributor

Hello @sanposhiho, could you assign me too if you don't mind?

@utam0k
Copy link
Member

utam0k commented Jun 28, 2023

@sanposhiho 🙏 👍

@sanposhiho
Copy link
Member Author

@y-taka-23 assigned VolumeBinding to you!

@y-taka-23
Copy link
Contributor

@sanposhiho Thx!

@Gekko0114
Copy link
Member

Hi @sanposhiho ,

If there are any remaining items, could you assign it to me?

@HirazawaUi
Copy link
Contributor

I'd like to work for VolumeRestriction
/assign

@sanposhiho
Copy link
Member Author

sanposhiho commented Jun 29, 2023

@Gekko0114 @HirazawaUi I assigned VolumeRestriction and VolumeZone to you each

@Gekko0114
Copy link
Member

Sure, thank you so much!

@z1cheng
Copy link
Member

z1cheng commented Jul 1, 2023

Hey @sanposhiho, could you please assign it to me if there are any remaining items? :)

@sanposhiho
Copy link
Member Author

@z1cheng Unfortunately, no more slots here.

@pohly
Copy link
Contributor

pohly commented Apr 26, 2024

I agree that it's surprising. Perhaps it helps avoid some lengthy code path more often - but that's purely speculation.

I suppose the bottom line is: queuing hints help a bit with structured parameters, but are not required.

@alculquicondor
Copy link
Member

Thanks.

In any case, I think we are going down the path of keeping each plugin hint as simple as possible and trying to optimize the overall framework.

Maybe the hint for DRA will have a chance to simplify as we progress with the semantic model and focus on hints just to optimize that path.

@sanposhiho
Copy link
Member Author

Regarding the throughput topic mentioned here, we discussed at #124384 and just created the one for tracking: #124566

@bells17
Copy link
Contributor

bells17 commented Apr 29, 2024

@y-taka-23 (cc: @AxeZhan @Gekko0114 @sanposhiho) Regarding the implementation of the QueueingHintFn for VolumeBinding, if you are unable to secure time to work on it, would it be okay for me to implement it?
#118893 (comment)

@sanposhiho
Copy link
Member Author

@bells17 Feel free to proceed if no response from them for a while, e.g., after GW.
Also, if @AxeZhan is still interested in taking part of it, you both can discuss and decide which event to work on. (There're many events registered in VolumeBinding.)

@AxeZhan
Copy link
Member

AxeZhan commented May 1, 2024

@bells17 Feel free to start working on it, I can help reviewing :)
Notice that the logic of volume binding is pretty complex(there involves finding pv,pvc and sc), and we may want to reduce the complexity in queueingHint functions.

@YamasouA
Copy link

YamasouA commented May 4, 2024

Hi.
I am interested in implementing this feature.
I would like to participate this.

@sanposhiho
Copy link
Member Author

sanposhiho commented May 7, 2024

@bells17 @YamasouA
Please leave a comment about which event you want to pick up before starting, in order to prevent conflicting work.
Here's the list of events in VolumeBinding.

// Pods may fail because of missing or mis-configured storage class
// (e.g., allowedTopologies, volumeBindingMode), and hence may become
// schedulable upon StorageClass Add or Update events.
{Event: framework.ClusterEvent{Resource: framework.StorageClass, ActionType: framework.Add | framework.Update}},
// We bind PVCs with PVs, so any changes may make the pods schedulable.
{Event: framework.ClusterEvent{Resource: framework.PersistentVolumeClaim, ActionType: framework.Add | framework.Update}},
{Event: framework.ClusterEvent{Resource: framework.PersistentVolume, ActionType: framework.Add | framework.Update}},
// Pods may fail to find available PVs because the node labels do not
// match the storage class's allowed topologies or PV's node affinity.
// A new or updated node may make pods schedulable.
//
// A note about UpdateNodeTaint event:
// NodeAdd QueueingHint isn't always called because of the internal feature called preCheck.
// As a common problematic scenario,
// when a node is added but not ready, NodeAdd event is filtered out by preCheck and doesn't arrive.
// In such cases, this plugin may miss some events that actually make pods schedulable.
// As a workaround, we add UpdateNodeTaint event to catch the case.
// We can remove UpdateNodeTaint when we remove the preCheck feature.
// See: https://github.com/kubernetes/kubernetes/issues/110175
{Event: framework.ClusterEvent{Resource: framework.Node, ActionType: framework.Add | framework.UpdateNodeLabel | framework.UpdateNodeTaint}},
// We rely on CSI node to translate in-tree PV to CSI.
{Event: framework.ClusterEvent{Resource: framework.CSINode, ActionType: framework.Add | framework.Update}},
// When CSIStorageCapacity is enabled, pods may become schedulable
// on CSI driver & storage capacity changes.
{Event: framework.ClusterEvent{Resource: framework.CSIDriver, ActionType: framework.Add | framework.Update}},
{Event: framework.ClusterEvent{Resource: framework.CSIStorageCapacity, ActionType: framework.Add | framework.Update}},

@bells17
Copy link
Contributor

bells17 commented May 7, 2024

I want to pick up the following list of events:

  • StorageClass (Add/Update)
  • PVC (Add/Update)
  • PV (Add/Update)
  • CSIStorageCapacity (Add/Update)

@YamasouA
Copy link

YamasouA commented May 7, 2024

I would like to pick up the other three events.

@Gekko0114
Copy link
Member

Gekko0114 commented May 18, 2024

@sanposhiho and others,
Sorry, I don't follow the discussion in detail.
I've created a PR #119373 several months ago, and I've rebased the PR today because of conflict.
And I am waiting for an approval from sig-storage team members.

Should I change the scope of this PR? Or can I just wait for an approval?

@AxeZhan
Copy link
Member

AxeZhan commented May 18, 2024

@Gekko0114
Considering the context of #119373 , I think it's ok to keep it as it is now.
It's been a real long time since sig-storage reviewed ... Maybe you can try reach the slack channel of sig-storage.

@bells17
Copy link
Contributor

bells17 commented May 19, 2024

I think there are changes that require sig-storage's review, including VolumeZone, VolumeRestriction, and possibly VolumeBinding, which is currently being implemented.
So, if you're planning to request a review via Slack, it might be better for communication to ask sig-storage to review these three PRs together.

@sanposhiho @HirazawaUi By the way, I noticed that the following PR hasn't been updated for a while. Would you give me an update on its current status?
If you're having trouble finding time to work on the changes in this PR due to other commitments, I'd be happy to take over. Would you be open to that?
#119405

@Gekko0114
Copy link
Member

I think there are changes that require sig-storage's review, including VolumeZone, VolumeRestriction, and possibly VolumeBinding, which is currently being implemented.
So, if you're planning to request a review via Slack, it might be better for communication to ask sig-storage to review these three PRs together.

Thank you for your suggestion! I already asked via slack here with @HirazawaUi several months ago. It is being stacked by sig-storage team. I will ping them again.
https://kubernetes.slack.com/archives/C09QZFCE5/p1706531933985289?thread_ts=1701594690.661529&cid=C09QZFCE5

@sanposhiho
Copy link
Member Author

sanposhiho commented May 20, 2024

Let's narrow down the scope of each PR by, for example, creating a PR per event instead of per plugin so that people can take a look more easily. Also, additionally you can attend their SIG meeting to ask for help directly (I know many people discussing here would have a timezone problem though).

@carlory
Copy link
Member

carlory commented May 20, 2024

@sanposhiho I can help to review storage pr if I have time. please assign/cc it to me if it's ready for review.

@Gekko0114
Copy link
Member

Gekko0114 commented May 20, 2024

I've divided volume_zone's PR into 4 parts.
Please let me know it is OK. If so, I will ask sig-storage to review these PRs.

#125000
#125001
#124996
#124998

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: Backlog
Development

No branches or pull requests