Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move allow_readwrite and allow_readonly from SyscallDriver into common Grant framework #2906

Merged
merged 25 commits into from
Dec 13, 2021

Conversation

jettr
Copy link
Contributor

@jettr jettr commented Nov 19, 2021

Pull Request Overview

Similar to how we removed subscribe from the SyscallDriver trait in favor of the kernel always managing the data, do the same for read-only and read-write process buffers.

This has multiple advantages

  • It reduces boilerplate code in almost every SyscallDriver implementation
    • Each boiler plate implementation of allow_readonly or allow_readwrite costs about 170 bytes of flash
  • It allows libtock-rs implementation to rely on the fact that there cannot be improperly behaving capsule that holds on to process buffers when application calls allow with 0 (i.e. unsubscribe)

Note each commit builds and compiles (but isn't necessarily functional)

Testing Strategy

So far, I have tried this code out locally with a modification in kernel that handles both the SyscallDriver and new Grant-based process buffer allows. I didn't upload that, but I was able to convert a few capsules over to the new approach and they worked correctly.

Once we migrate all of the capsules, we would need to test again of course

TODO or Help Wanted

  • Help with converting the reset of the capsule code to use the new common framework style of accessing buffers. So far I only converted ADC as an example. Once people agree this is a good way forward, we would need to convert the rest of the capsules
    • boards/
    • capsules//.rs (files in subfolders within capsules)
    • capsules/[a-c]*.rs
    • capsules/[d-i]*.rs
    • capsules/[j-m]*.rs
    • capsules/[n-r]*.rs
    • capsules/[a-z]*.rs
  • Once all capsules are converted, we should remove the allow_readonly and allow_readwrite function on the SyscallDriver trait
  • Potential cleanup: we may no longer need the Default implementation on ReadOnlyProcessBuffer and ReadWriteProcessBuffer since they no longer needed in a capsule specific grant region, which required Default.
  • Potential cleanup: if we can remove Default and impl for process buffers, then we should be able to directly put a lifetime constraint on the process buffer structs themselves. This will allow us to drop the Refs versions and permit the unsafe callers of new to more easily uphold the invariant of the pointer lifetime
  • Potential cleanup: add a access_grant_inner function to de-monomorphize if needed for size reduction.

Documentation Updated

  • Updated the relevant files in /docs

Formatting

  • Ran make prepush.

@jrvanwhy
Copy link
Contributor

  • It allows libtock-rs implementation to rely on the fact that there cannot be improperly behaving capsule that holds on to process buffers when application calls allow with 0 (i.e. unsubscribe)

In order for libtock-rs to rely on this implementation, we would need to supercede or modify the following wording from TRD 104:

Note that these requirements are typically on capsule code, and are not
enforced by the core kernel. Because capsules may have bugs or not be fully
trusted, userspace SHOULD NOT rely on the fact that drivers follow these rules as
described, especially if the capsule code is not independently checked and tested.
Instead, userspace SHOULD employ additional checks to ensure that
drivers behave as specified, and implement proper error handling as needed.
The core kernel MAY enforce a subset of these
specified rules in the future.

@phil-levis Would modifying TRD 104 or adding a new TRD be the correct process to do this?

Copy link
Contributor

@hudson-ayers hudson-ayers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the design overall, most of these comments are nits. I will try to take a pass on porting some more capsules sometime this week

One high-level comment:
There is a lot of code de-duplication in this PR which is largely separate from adding the actual allow implementations. For enter_grant_kernel_managed() it makes sense from a code size perspective. De-duplicating the code in access_grant() and access_grant_with_allocator() is nice, though I do wonder what the implications are / if there was a reason those were initially duplicated code instead of what you have done. Most boards will never use the allocator so that is something to think about.

Regarding your potential cleanups:

  • getting rid of Default for the ProcessBuffer types would be nice, and I don't immediately see any reason we shouldn't. Being able to drop the ref types in that case would be a win.
  • I did try adding an access_grant_inner() when I first made a de-monomorphization pass on the grant code, and found doing so actually increased code size because there was not enough that could be taken out into the inner function to offset the extra overhead of a function call in the outer one. That may be worth revisiting though, given how much this file has changed. Either way I think that may be best left for a followup PR.

kernel/src/kernel.rs Outdated Show resolved Hide resolved
kernel/src/kernel.rs Outdated Show resolved Hide resolved
kernel/src/grant.rs Outdated Show resolved Hide resolved
kernel/src/grant.rs Outdated Show resolved Hide resolved
kernel/src/grant.rs Show resolved Hide resolved
kernel/src/grant.rs Outdated Show resolved Hide resolved
kernel/src/grant.rs Outdated Show resolved Hide resolved
kernel/src/grant.rs Outdated Show resolved Hide resolved
kernel/src/grant.rs Outdated Show resolved Hide resolved
kernel/src/grant.rs Outdated Show resolved Hide resolved
kernel/src/kernel.rs Outdated Show resolved Hide resolved
@jettr
Copy link
Contributor Author

jettr commented Nov 23, 2021

One high-level comment: There is a lot of code de-duplication in this PR which is largely separate from adding the actual allow implementations. For enter_grant_kernel_managed() it makes sense from a code size perspective. De-duplicating the code in access_grant() and access_grant_with_allocator() is nice, though I do wonder what the implications are / if there was a reason those were initially duplicated code instead of what you have done. Most boards will never use the allocator so that is something to think about.

Boy scout motto: always try to leave the place cleaner than when you found it :) Also it pains me to copy and paste code when it is reasonable to de-dup it. There is much less potential for future mistakes. I also couldn't find any good reason to not de-dep in this way. I am actually now thinking about have make the allocator version an internal one that takes in its closure by dyn reference, and that have the two public method call the internal one with the dynamic dispatch. That would cut down on monomorphism cost (most likely -- still would have to profile size changes) since we only monomorphize on return type instead of each unique closure instance. I agree that this should be in a follow up PR too.

Thanks for being willing to help out to convert some of these files! I figured we could just take an alphabetic chunk of the files. When someone wants to take a batch of files, please comment here first so we don't duplicate effort. I am going to take a batch of files later today since it seems like there aren't any major red flags with this PR. I will post the chunk I take then.

@jettr
Copy link
Contributor Author

jettr commented Nov 23, 2021

  • It allows libtock-rs implementation to rely on the fact that there cannot be improperly behaving capsule that holds on to process buffers when application calls allow with 0 (i.e. unsubscribe)

In order for libtock-rs to rely on this implementation, we would need to supercede or modify the following wording from TRD 104:

Note that these requirements are typically on capsule code, and are not
enforced by the core kernel. Because capsules may have bugs or not be fully
trusted, userspace SHOULD NOT rely on the fact that drivers follow these rules as
described, especially if the capsule code is not independently checked and tested.
Instead, userspace SHOULD employ additional checks to ensure that
drivers behave as specified, and implement proper error handling as needed.
The core kernel MAY enforce a subset of these
specified rules in the future.

@phil-levis Would modifying TRD 104 or adding a new TRD be the correct process to do this?

Yay, this is still a draft! I added a commit that updates this draft TRD. I was going to add some text, but I thought the cleanest way to modify it was to actually just remove the paragraph that Johnathan mentioned completely; this matches the way the subscribe section is worded by not calling to attention that the application can rely on this behavior.

@hudson-ayers
Copy link
Contributor

Boy scout motto: always try to leave the place cleaner than when you found it :) Also it pains me to copy and paste code when it is reasonable to de-dup it. There is much less potential for future mistakes. I also couldn't find any good reason to not de-dep in this way. I am actually now thinking about have make the allocator version an internal one that takes in its closure by dyn reference, and that have the two public method call the internal one with the dynamic dispatch. That would cut down on monomorphism cost (most likely -- still would have to profile size changes) since we only monomorphize on return type instead of each unique closure instance. I agree that this should be in a follow up PR too.

My main concern here was that having access_grant_with_allocator always be called would lead to code associated with the GrantRegionAllocator being included in the binary, despite that type not actually being used by the caller of access_grant_with_allocator(). I just tested it and it looks like LLVM's dead code elimination is smart enough to prevent that, so I am happy with how this looks.

Monomorphizing only on return type would be an interesting optimization for size -- many closures probably use the same few return types (e.g. Result<(), ErrorCode> or ()) -- but this would definitely add a runtime cost to almost every system call, since those closures could no longer be inlined across the call to Grant::enter(). Profiling both the size and cycle overhead would be important, imo.

Thanks for being willing to help out to convert some of these files! I figured we could just take an alphabetic chunk of the files. When someone wants to take a batch of files, please comment here first so we don't duplicate effort. I am going to take a batch of files later today since it seems like there aren't any major red flags with this PR. I will post the chunk I take then.

Alphabetic sounds good to me, I won't be touching anything today so I can take whatever chunk comes after yours.

Copy link
Contributor

@bradjc bradjc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through grant.rs and kernel.rs pretty closely and they look good to me.

The only thing I'm not 100% about is GrantEnterLifetimeGuard. process.leave_grant(grant_num); is only one line and provides a clear window between when the grant is entered and left. Adding another type and instructing the caller to hold on to the type seems like extra complexity for not a lot of benefit.

kernel/src/grant.rs Outdated Show resolved Hide resolved
kernel/src/grant.rs Outdated Show resolved Hide resolved
kernel/src/grant.rs Outdated Show resolved Hide resolved
kernel/src/grant.rs Outdated Show resolved Hide resolved
kernel/src/kernel.rs Outdated Show resolved Hide resolved
kernel/src/process.rs Outdated Show resolved Hide resolved
@jettr
Copy link
Contributor Author

jettr commented Nov 23, 2021

I went through grant.rs and kernel.rs pretty closely and they look good to me.

Thank you for taking a thorough look at it! I appreciate it!

The only thing I'm not 100% about is GrantEnterLifetimeGuard. process.leave_grant(grant_num); is only one line and provides a clear window between when the grant is entered and left. Adding another type and instructing the caller to hold on to the type seems like extra complexity for not a lot of benefit.

It seems more dangerous (from incorrect behavior POV) to have the unmatched enter and leave grants occur because someone accidentally used the ? operator between enter and leave calls. Using an RAII object guards against this easy-to-make mistake in the future.

Should we make the guard hold the pointer, so you have to access the guard to get the pointer? I will upload a commit for that.

@jettr
Copy link
Contributor Author

jettr commented Nov 23, 2021

Should we make the guard hold the pointer, so you have to access the guard to get the pointer? I will upload a commit for that.

That doesn't work as well as I thought for the two places we use it. I would want to wrap a KernelManagedLayout in only place and the raw pointer in the other.

@jettr
Copy link
Contributor Author

jettr commented Nov 23, 2021

I converted two files. These conversions take more time than I would have hoped. I do like the paradigm of creating something like the following at the top of the file. Thoughts?

/// Ids for read-only allow buffers
mod ro_allow {
    pub const WRITE: usize = 0;
    /// The number of allow buffers the kernel stores for this grant
    pub const COUNT: usize = 1;
}

/// Ids for read-write allow buffers
mod rw_allow {
    pub const READ: usize = 0;
    pub const CFG: usize = 1;
    pub const RX_CFG: usize = 2;
    /// The number of allow buffers the kernel stores for this grant
    pub const COUNT: usize = 3;
}

@hudson-ayers
Copy link
Contributor

I converted two files. These conversions take more time than I would have hoped. I do like the paradigm of creating something like the following at the top of the file. Thoughts?

/// Ids for read-only allow buffers
mod ro_allow {
    pub const WRITE: usize = 0;
    /// The number of allow buffers the kernel stores for this grant
    pub const COUNT: usize = 1;
}

/// Ids for read-write allow buffers
mod rw_allow {
    pub const READ: usize = 0;
    pub const CFG: usize = 1;
    pub const RX_CFG: usize = 2;
    /// The number of allow buffers the kernel stores for this grant
    pub const COUNT: usize = 3;
}

Thanks for doing the networking capsules, those are probably 2 of the harder ones to take on.

I am a fan of that paradigm

@jettr
Copy link
Contributor Author

jettr commented Nov 24, 2021

That was a marathon! I finished converting everything over and removed the allow_readonly and allow_readwrite function from the SyscallDriver interface.

I pulled out conversions to their own commit if there were any non-trivial changes-- specifically ipc, touch, screen, and bluetooth.

Looks like we can't immediately remove the Default implementation for the process buffers because the UserspaceReadableProcessBuffer is still using the SyscallDriver mechanism, and I think this PR is already large enough without trying to tackle that too :)

Assuming everything looks okay (I am assuming there will be feedback to incorporate first), this is ready to land!

@jettr
Copy link
Contributor Author

jettr commented Nov 29, 2021

Looks like I already have a rebase conflict, should we leave it for now and do one rebase and force push at the end of the review or should I rebase and force push now (or try to merge mainline changes into this branch)?

@lschuermann
Copy link
Member

Looks like I already have a rebase conflict, should we leave it for now and do one rebase and force push at the end of the review or should I rebase and force push now (or try to merge mainline changes into this branch)?

I suppose whatever works best for you. This touches a large number of files, sometimes it's easier to keep on track with smaller, more frequent conflicts.

@hudson-ayers
Copy link
Contributor

Surprisingly, this only saves 530 bytes of code size on Imix. While that is still a win, I must admit I was expecting a little more

@hudson-ayers hudson-ayers added the P-Significant This is a substancial change that requires review from all core developers. label Nov 29, 2021
@jettr
Copy link
Contributor Author

jettr commented Nov 29, 2021

This is surprising, I was seeing about ~200 bytes savings for every allow_readonly allow_readwrite call that was omitted. I guess this favors the boards that have more driver. Also this helps make libtock-rs smaller, which will favor boards that have more apps. I do wish it saved more, but still think this is the correct path forward.

@hudson-ayers
Copy link
Contributor

I would like this PR even if there were no savings -- the reduction in duplicated code + correctness guarantees are really valuable. I am just surprised by the result

@jettr jettr mentioned this pull request Nov 29, 2021
Copy link
Contributor

@hudson-ayers hudson-ayers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried out the IPC libtock-c apps (rot13_service/rot13_client) with this branch, and they do not work. That should be fixed before this is merged. I tried to debug it briefly but did not discover the cause -- I should have some time tomorrow to look further

kernel/src/ipc.rs Outdated Show resolved Hide resolved
kernel/src/ipc.rs Show resolved Hide resolved
@jettr
Copy link
Contributor Author

jettr commented Nov 30, 2021

We might just want to try out the alternate approach for IPC that leaves the handles unchanged. This does mean that every readwrite_allow call into the kernel will check for the IPC driver number for the special case logic, but I think that might be an okay cost to pay. Thoughts?

@jettr
Copy link
Contributor Author

jettr commented Nov 30, 2021

I figured out what is wrong with the IPC implementation: the ipc_register_service_callback in libtock-c hard codes a 0 for the target_id to indicate service when it calls subscribe. If we instead took in the service name and performed a discover on itself, and use that number instead of hard coding 0, then it would work. That seems like a better API, but it is a breaking change for the IPC driver unfortunately.

Is it okay to change the libtock-c implementation right now or should I put in the special case handling code for IPC in kernel?

@jettr
Copy link
Contributor Author

jettr commented Nov 30, 2021

I made the speculative change for IPC in a libtock-c PR. This seems like the way to go. This will also let us drop the NUM_CALLS const generic parameter on IPC in favor of using NUM_PROCS directly for both allow and upcalls.

@hudson-ayers
Copy link
Contributor

My instinct is that your fix is preferable -- I think we should avoid special case handling for IPC if at all possible, especially since there are a number of issues with IPC that we would still like to eventually resolve, and given that plenty of boards choose not to include IPC, and we would like to keep that decision zero cost. IPC is not a stable driver, so I think we can update it without a major version bump, but I think a minor version bump might be prudent. That said, we are probably due for a minor version bump reasonably soon anyway. Removing the extra upcall from the IPC driver is a nice added bonus.

@bradjc
Copy link
Contributor

bradjc commented Nov 30, 2021

Changing IPC is fine. We don't guarantee any stability so we don't have to worry about version numbers.

@jettr
Copy link
Contributor Author

jettr commented Dec 1, 2021

All tests should be passing now with the libtock-c PR change! Everything is rebased on ToT and merge conflicts are resolved.

jettr and others added 12 commits December 13, 2021 08:52
To update screen, it required non-trivial changes:
 - Allowing a buffer no longer check for the correct length
   - This check has been moved to the fill and write operation
 - Setting a new buffer while writing, will not start at the beginning
of the buffer. This is okay because the documentation calls this out as
undefined behavior

Change-Id: Ia5ca9fd2dca559c990c83a565fe2fef8b7904951
To update touch, it required a non-trivial change:
 - Allowing a buffer no longer checks for multi touch support

Change-Id: Ia5ca9fd2dca559c990c83a565fe2fef8b7904951
These files did not require any special changes

Change-Id: Icd0b521e93e183ef5adafe1121f97720ae3d98c9
To move IPC over to kernel-based allow, the following significant
change was made:
 - The target id returned by discover and used to notified is now the
direct index number in the process array instead of an identifier
   - This means that the target process could restart and the sending
process would not have to re-register the allow buffer again. This
actually seems like behavior we might want since the sending process only
has the granularity to discover by process name which only has the
granularity of index within the process array.

From the caller (application) POV, nothing should have changed. They
still need to discover a target_id, then register an allow buffer and
notify with that target_id. They same flow will work before and after
this change.

Change-Id: Ia0a7a54d464fe199be11552de71b36956e79fa5a
Now that kernel has taken over the read-only and read-write allow
operations, remove the function from the SyscallDriver interface

Change-Id: Id563e34c2d29baadc5879646a08ff776f85bf3bb
After reworking IPC implementation to use kernel-based allow framework,
we only need NUM_PROCS upcalls instead of NUM_PROCS + 1 upcalls. We can
remove the NUM_UPCALLS_IPCS concept everywhere.

Change-Id: I4a280a935505d57ddf6c68b666f3e25fba7e89b1
The allows function set the active app, but we can no longer do that.
Instead we should set the active app variable in the command right
before we make the uart call that will trigger the callback which uses
the active_app variable.

Change-Id: Ia01a4c2ab669b46fcec986343b69bf5f1f269eb4
Now that the kernel handles allowing buffers, we can be more certain
about which return code are return for which scenarios. Add this
information to the documentation.

Change-Id: I6c704f53279fd66e4a9c7e0f1529fc284f61c5cf
Change-Id: I624c87e4a49851299f96939d15c35d3bb5b515fe
Update comments to match new allow mechanics, and try to keep
consistency.
The `Process::enqueue_task` function does not provide any information
on whether or not a scheduled callback is a null-callback.
Furthermore, we are scheduling an IPC task, which will ultimately get
routed back to us. This will always succeed as long as there is
sufficient space in the Process task queue and there are no other
kernel-internal errors. Hence this removes a misleading comments about
being able to handle null-callbacks differently here.

Signed-off-by: Leon Schuermann <leon@is.currently.online>
@jettr
Copy link
Contributor Author

jettr commented Dec 13, 2021

@jettr Would you mind rebasing this PR on master? There are no conflicts.

Done. And thank you for figuring out the issue with libtock-c!

bradjc
bradjc previously approved these changes Dec 13, 2021
bors bot added a commit to tock/libtock-c that referenced this pull request Dec 13, 2021
271: ipc: allow registering client callback with svc_id = 0 r=hudson-ayers a=lschuermann

### Pull Request Overview

With the changes of 321a5a0 ("ipc: register service callback with service name") and tock/tock#2906, it is possible for a service to be assigned the svc_id = 0. Thus we must allow clients to register callbacks with this svc_id.

### Testing Strategy

This pull request was tested by running the failing LiteX simulation CI workflow locally. This resolves one of the issues of the failing CI.


### TODO or Help Wanted

This pull request does not fix all of the issues of tock/tock#2906. Somehow these changes appear to affect either the load order or scheduling behavior of the rot13_{service,client} apps. Because IPC service discovery seems fundamentally racy, with currently no reliable way to tell whether notifying a service worked, it requires further changes to work reliably.


Co-authored-by: Leon Schuermann <leon@is.currently.online>
@lschuermann
Copy link
Member

And thank you for figuring out the issue with libtock-c!

No worries! And wohoo, the CI is happy again!

Copy link
Contributor

@hudson-ayers hudson-ayers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r+

finally!

@bors
Copy link
Contributor

bors bot commented Dec 13, 2021

@bors bors bot merged commit 2763f02 into tock:master Dec 13, 2021
Copy link
Member

@lschuermann lschuermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked over most of this PR, especially the kernel code and it looks pretty good to me. Great work @jettr!

@lschuermann
Copy link
Member

welp... it appears i was a little to slow. :)

elenaf9 added a commit to elenaf9/tock that referenced this pull request Jul 2, 2023
tyler-potyondy pushed a commit to tyler-potyondy/libtock-c that referenced this pull request Mar 13, 2024
270: ipc: register service callback with service name r=hudson-ayers a=jettr

Use package name of service to register the service callback. This makes it more clear API and makes the kernel side easier.

It will allow us to drop any special case code in kernel for IPC and it will most likely allow us to drop the `NUM_UPCALLS` const generic as well (and we can just use `NUM_PROCS` for both upcall and allows)

See [RFC: Move allow_readwrite and allow_readonly from SyscallDriver into common Grant framework][0] for motivation

[0]:tock/tock#2906

Co-authored-by: Jett Rink <jettrink@google.com>
tyler-potyondy pushed a commit to tyler-potyondy/libtock-c that referenced this pull request Mar 13, 2024
With the changes of 321a5a0 ("ipc: register service callback with
service name") and tock/tock#2906, it is possible for a service to be
assigned the svc_id = 0. Thus we must allow clients to register
callbacks with this svc_id.

Fixes: 321a5a0 ("ipc: register service callback with service name")
Signed-off-by: Leon Schuermann <leon@is.currently.online>
tyler-potyondy pushed a commit to tyler-potyondy/libtock-c that referenced this pull request Mar 13, 2024
271: ipc: allow registering client callback with svc_id = 0 r=hudson-ayers a=lschuermann

### Pull Request Overview

With the changes of 321a5a0 ("ipc: register service callback with service name") and tock/tock#2906, it is possible for a service to be assigned the svc_id = 0. Thus we must allow clients to register callbacks with this svc_id.

### Testing Strategy

This pull request was tested by running the failing LiteX simulation CI workflow locally. This resolves one of the issues of the failing CI.


### TODO or Help Wanted

This pull request does not fix all of the issues of tock/tock#2906. Somehow these changes appear to affect either the load order or scheduling behavior of the rot13_{service,client} apps. Because IPC service discovery seems fundamentally racy, with currently no reliable way to tell whether notifying a service worked, it requires further changes to work reliably.


Co-authored-by: Leon Schuermann <leon@is.currently.online>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kernel P-Significant This is a substancial change that requires review from all core developers. WG-OpenTitan In the purview of the OpenTitan working group.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants