New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SMP: only enter scheduler if node holds BKL #1029
base: master
Are you sure you want to change the base?
Conversation
In the IPI remote call paths where target nodes enter the kernel under the protection of the calling core's BKL, the scheduler may be invoked after handling the interrupt. The MCS kernel protects against this by doing a check, however, the commit that added this check did not do the same for the non-MCS kernel, so we remove the ifdef and check for both MCS and non-MCS. Signed-off-by: Andy Bui <andy.bui2001@gmail.com>
Nice find. Do you have a test case that fails before and passes after? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure the change achieves what the commit message says, though. It looks to me like this this now ends up calling the scheduler for any interrupt, not just for IPIs or on SMP configs. E.g. the
if (SMP_TERNARY(clh_is_self_in_queue(), 1))
translates to just
if (1) {
in the verified unicore configs.
This might already be unintended on MCS, I'm not clear on that.
Ignore my ramblings. We do want the scheduler to always be called on unicore, so this is fine -- it already is being called, I just misread the diff. |
I could be wrong about this but I don't think there is a problem. On non-MCS, if |
Agree with @jearc. This might not currently break anything, however, IMO it is assumed that the scheduler is called under the protection of the BKL. We should make this explicit with either an assertion/check, and rely less on everything playing nicely (such as the scheduler action only being |
I'm happy with that line of reasoning. If we ever get to verify the SMP config, this would make it easier. @kent-mcleod any concerns from your side? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit that added this if-check is 798fb76. I guess that MCS and SMP were added around the same time and that SMP issues found were fixed for MCS-only sometimes, like here.
The proof test fails, which means this will need a (hopefully trivial) proof update before it can get merged. |
if (SMP_TERNARY(clh_is_self_in_queue(), 1)) { | ||
#endif | ||
schedule(); | ||
activateThread(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IpiRemoteCall_VMCheckBoundNotification
on x86 looks like it is relying on the call to schedule on the way out of the ipi. So there's potentially impact there if this were to be changed (and maybe it's broken on MCS as it is and could explain some MCS + SMP odd behavior reported in the past: #118 (comment)).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose the performance penalty for a IpiRemoteCall_VMCheckBoundNotification
+ irq_reschedule_ipi
would be too big?
The MCS + SMP case here looks like it's incorrect and presents a race condition, as the target core's scheduling queues become a critical section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andybui01 can you expand upon this a bit? I'm actually investigating a problem where a VM is getting starved of execution time, and as far as I can tell, the userspace VCPU execution loops aren't the culprit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really familiar with the x86 virtualization support but here's what I recall: currently the VMCheckBoundNotification()
function that runs on the destination core may ask for a switch to a new thread. This new thread gets scheduled and activated in handleInterruptEntry@syscall.c
. This can work because remote call IPIs operate with the acquired BKL of the calling core.
If we follow this PR so that schedule()
checks that the current core holds the BKL, this IPI will break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andybui01 can you expand upon this a bit? I'm actually investigating a problem where a VM is getting starved of execution time, and as far as I can tell, the userspace VCPU execution loops aren't the culprit.
If you are running multiple tasks on the same core that are woken up by notifications, perhaps #902 is related. To quickly test if this is the cause, enable signal fast paths and change the SCHED_ENQUEUE
with SCHED_ENQUEUE
at
Line 714 in 0260fba
SCHED_APPEND(dest); |
But try removing the whole if-check and calling schedule unconditionally first.
I'm not really familiar with the x86 virtualization support but here's what I recall: currently the
VMCheckBoundNotification()
function that runs on the destination core may ask for a switch to a new thread. This new thread gets scheduled and activated inhandleInterruptEntry@syscall.c
. This can work because remote call IPIs operate with the acquired BKL of the calling core.If we follow this PR so that
schedule()
checks that the current core holds the BKL, this IPI will break.
Everyone takes the lock if IPI is not a remote call, but only x86 wants a schedule if the remote call is IpiRemoteCall_VMCheckBoundNotification
.
I don't understand why x86 has a IpiRemoteCall_VMCheckBoundNotification
IPI instead of just using the reschedule IPI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an x86 system, so the Notification fastpath isn't an option I can work with.
In the IPI remote call paths where target nodes enter the kernel under the protection of the calling core's BKL, the scheduler may be invoked after handling the interrupt. The MCS kernel protects against this by doing a check, however, the commit that added this check did not do the same for the non-MCS kernel, so we remove the ifdef and check for both MCS and non-MCS.