IRQ Fastpath #1227

Open
wants to merge 1 commit into
base: master

danshea00
Contributor

Add an IRQ fastpath for aarch64 MCS. This fastpath is similar to the signal fastpath but also handles the case when the destination thread is of higher priority than the interrupted thread, in which case a direct context switch occurs.
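The decision described above can be modelled roughly as follows. This is only an illustrative sketch, not the code in this patch: `thread_model_t`, `irq_action_t`, `irq_fastpath_decide`, and the `waiting_on_ntfn` eligibility check are all hypothetical names and assumptions made for the example. The only behaviour taken from the description is that a higher-priority destination thread leads to a direct context switch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of the fastpath check (names are illustrative). */
typedef enum {
    IRQ_TAKE_SLOWPATH,   /* fall back to the normal IRQ path */
    IRQ_RESUME_CURRENT,  /* deliver the signal, keep running the interrupted thread */
    IRQ_DIRECT_SWITCH    /* switch straight to the destination thread */
} irq_action_t;

/* Minimal stand-in for a thread: only the fields this sketch needs. */
typedef struct {
    uint8_t prio;          /* larger value means higher priority */
    bool waiting_on_ntfn;  /* assumption: fastpath needs a thread blocked on the notification */
} thread_model_t;

irq_action_t irq_fastpath_decide(const thread_model_t *interrupted,
                                 const thread_model_t *dest)
{
    /* Assumed eligibility check: anything unusual goes to the slowpath. */
    if (interrupted == NULL || dest == NULL || !dest->waiting_on_ntfn) {
        return IRQ_TAKE_SLOWPATH;
    }
    /* From the description: a higher-priority destination is switched to directly. */
    if (dest->prio > interrupted->prio) {
        return IRQ_DIRECT_SWITCH;
    }
    /* Otherwise the interrupted thread continues, as in the signal fastpath. */
    return IRQ_RESUME_CURRENT;
}
```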

seL4Bench IRQUser benchmarks (non context switching benchmark is here):

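| Board | Context switching: Slowpath | Context switching: Fastpath | Context switching: Diff | Non context switching: Slowpath | Non context switching: Fastpath | Non context switching: Diff |
|---|---|---|---|---|---|---|
| imx8mm | 808 (13) | 594 (11) | 26% | 762 (10) | 367 (3) | 52% |
| imx8mq | 1365 (3) | 625 (3) | 54% | 1393 (20) | 367 (3) | 74% |
| odroidc4 | 836 (5) | 745 (15) | 11% | 814 (49) | 512 (39) | 37% |
| tqma | 863 (5) | 757 (10) | 12% | 868 (7) | 468 (3) | 46% |
| tx1 | 810 (5) | 735 (10) | 9% | 767 (6) | 489 (5) | 36% |
| zcu106 | 705 (10) | 557 (12) | 21% | 749 (14) | 492 (44) | 34% |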
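
The Diff values above are consistent with the relative reduction from slowpath to fastpath, (slowpath - fastpath) / slowpath, rounded to the nearest percent. A small sketch of that arithmetic, where `diff_percent` is a hypothetical helper name and not part of seL4Bench:

```c
/* Relative reduction from slow to fast, rounded to the nearest whole percent.
 * Returns 0 when slow is 0 or when fast is not smaller than slow. */
unsigned diff_percent(unsigned slow, unsigned fast)
{
    if (slow == 0 || fast >= slow) {
        return 0;
    }
    /* Adding slow / 2 before dividing rounds to the nearest integer. */
    return (100u * (slow - fast) + slow / 2u) / slow;
}
```

For example, `diff_percent(808, 594)` gives 26, matching the imx8mm context switching entry, and `diff_percent(1393, 367)` gives 74, matching the imx8mq non context switching entry.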

@Indanz
Contributor

Indanz commented Mar 19, 2024

First impression: There is something funny going on with imx8mq and the speedup isn't big enough to justify a fast path. I would try optimising the normal IRQ path first.

Have you tested with SMP too?

@danshea00
Contributor Author

I agree that something is wrong with the slowpath for the imx8mq - most of the speedup in that case is from using the fastpath version of switchToThread.

I have done some SMP testing. seL4Test passes with the SMP tests enabled.

@lsf37
Member

lsf37 commented Mar 26, 2024

Do we know how much slower the slow path becomes with this patch? The additional checks are probably not much, but it would be nice to quantify.

If there is a chance to improve the slow path speed to get to similar results, that would be great of course. @danshea00 can you comment on which bits in here you think brought the highest payoff in performance improvement?

@danshea00
Contributor Author

danshea00 commented Mar 27, 2024

Most of the speedup is from the simplified scheduling logic. Here is a rough breakdown of where time is going on the imx8mm (update timestamp also includes budget checking/charging):

I just measured the worst case slowpath overhead (from the start of the fastpath to after the final slowpath_irq) to be 120 (8) cycles on the imx8mm. I can add numbers for other boards if that would be useful.

@Indanz
Contributor

Indanz commented Mar 27, 2024

Nice graph, how did you collect those numbers?

In your new test, the while loop in high_prio_fn should be a while (i < N_RUNS), otherwise there is no bound on how far into results[] will be written. I also don't see how high_prio_fn is stopped currently, other than crashing. On non-MCS, I'm not sure how you can distinguish between kernel timer ticks and user space IRQs.
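
A minimal sketch of the bounded loop, under stated assumptions: `N_RUNS` and `results[]` are the names from the benchmark, but the value of `N_RUNS`, the helper `wait_for_irq_and_timestamp` (a stand-in that fakes a cycle count instead of waiting on a notification), and the `capacity` parameter are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define N_RUNS 8 /* hypothetical value; the benchmark defines its own */

/* Stand-in for "wait on the notification, then read the cycle counter".
 * Here it only returns a fake, increasing cycle count. */
static uint64_t fake_cycles = 0;
static uint64_t wait_for_irq_and_timestamp(void)
{
    fake_cycles += 100;
    return fake_cycles;
}

/* Records at most N_RUNS samples and never writes past `capacity` entries.
 * Returns the number of samples stored. */
size_t high_prio_record(uint64_t results[], size_t capacity)
{
    size_t i = 0;
    while (i < N_RUNS && i < capacity) {
        results[i] = wait_for_irq_and_timestamp();
        i++;
    }
    return i;
}
```

With the bound in the loop condition, the function returns once `N_RUNS` samples have been taken, so a spurious extra interrupt cannot push the write index beyond the end of `results[]`.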

@danshea00
Contributor Author

I used tracepoints to record each section of the slowpath and fastpath. I just ran this with the two benchmarks, so each 'split' is the mean of 110 runs.
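
As a rough illustration of how per-section means can be derived from such recordings: the sketch below assumes each run yields a fixed number of cycle-counter values, one per tracepoint, and averages the gap between two consecutive tracepoints across runs. `trace_run_t`, `N_POINTS`, and `mean_split` are hypothetical names and do not reflect the kernel's tracepoint interface.

```c
#include <stddef.h>
#include <stdint.h>

#define N_POINTS 4 /* hypothetical number of tracepoints hit per run */

/* One run: cycle-counter values captured at successive tracepoints. */
typedef struct {
    uint64_t stamp[N_POINTS];
} trace_run_t;

/* Mean length in cycles of split `s`, i.e. stamp[s + 1] - stamp[s],
 * averaged over `n` runs. Returns 0.0 for an empty set or invalid split. */
double mean_split(const trace_run_t *runs, size_t n, size_t s)
{
    if (runs == NULL || n == 0 || s + 1 >= N_POINTS) {
        return 0.0;
    }
    uint64_t total = 0;
    for (size_t r = 0; r < n; r++) {
        total += runs[r].stamp[s + 1] - runs[r].stamp[s];
    }
    return (double)total / (double)n;
}
```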

Thanks for the feedback on the new benchmark. At the moment the test ends with high_prio_fn waiting on the notification when low_prio_fn exits its loop and sends on the done_ep. I'll clean things up if I end up making a PR to seL4Bench.

Add an IRQ fastpath for aarch64 MCS. This fastpath is similar to the
signal fastpath, but also handles the case when the destination thread
is of higher priority than the interrupted thread, in which case a
direct context switch occurs.

Signed-off-by: Dan Shea <danshea00@gmail.com>