Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Use core language async primitives instead external crates #172

Closed
no111u3 opened this issue Nov 27, 2019 · 22 comments
Closed

[RFC] Use core language async primitives instead external crates #172

no111u3 opened this issue Nov 27, 2019 · 22 comments

Comments

@no111u3
Copy link
Contributor

no111u3 commented Nov 27, 2019

Summary

This proposal proposes to change the way how async api does implemented. It makes async api more clearly for use and extend.

Motivation

Our current approach to implement trait primitives as depends on nb crate. It was actually on start of this project but after version 1.36 rust already have async traits and methodology for work with it. So in 2018 edition of rust lang we have language words as async, await. The marcosses exported by nb are conflicted with key words, we need to use r# prefix for each macros use.

This RFC attempts to change implementation to core types based implementation. It will be more effective and more conform rust language ideas.

Detailed design

This RFC inspired of PR #13, but after apply external crate I redesign for use core functionality because it will be better than external depends.

For example we have interface for serial:

use core::task::Poll;

/// A serial interface
pub trait Serial {
    /// Error type associated to this serial interface
    type Error;
    /// Reads a single byte
    fn read(&mut self) -> Poll<Result<u8, Self::Error>>;
    /// Writes a single byte
    fn write(&mut self, byte: u8) -> Poll<Result<(), Self::Error>>;
}

It use core::task::Poll as point of async run status. It was easy for implement and it isn't depends on nb or another external crates.

Alternatives

Don't implement this RFC.

Uresolved questions:

  • More clear examples of right use async code.
  • How we need to update depend crates?
  • Do we need to rewrite nb prefer than rewrite async api of embedded-hal?
@Nemo157
Copy link

Nemo157 commented Nov 27, 2019

There are semantics attached to Poll::Pending that did not exist on nb::Error::WouldBlock, specifically:

When a function returns Pending, the function must also ensure that the current task is scheduled to be awoken when progress can be made.

Having functions that return core::task::Poll but do not take a core::task::Context and arrange to wake the task when necessary is likely to lead to deadlocks if they are used by systems built around core::future::Future.

@Nemo157
Copy link

Nemo157 commented Nov 27, 2019

@Nemo157 I understand that problem we have but in current implementation of api i couldn't change behavior of run without more breakable changes on api. I'm planning to find solution in this situation. Thanks.

Sure, I just want to make sure this isn't missed. I have done a lot of work on async and have experimented with using it on embedded devices in https://github.com/Nemo157/embrio-rs/ including playing around with alternative trait definitions, just want to offer whatever insights I might have (I am lacking in experience with using the existing embedded ecosystem much, I was only really focused on trying it out with async).

If there's a plan for breaking changes to the trait APIs it'd be best to do them all at once, which may involve adding an additional core::task::Context parameter to all the traits, and providing extension traits similar to how futures is designed:

use core::{pin::Pin, task::{Poll, Context}, future::Future};

/// A serial interface
pub trait Serial {
    /// Error type associated to this serial interface
    type Error;
    /// Reads a single byte
    fn poll_read(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<u8, Self::Error>>;
    /// Writes a single byte
    fn poll_write(self: Pin<&mut Self>, byte: u8, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
}

pub trait SerialExt: Serial {
    fn read(&mut self) -> impl Future<Output = Result<u8, Self::Error>>;
    fn write(&mut self, byte: u8) -> impl Future<Output = Result<u8, Self::Error>>;
}

@no111u3
Copy link
Contributor Author

no111u3 commented Nov 27, 2019

It looks as real solution for me. I talk about current PR with @Disasm, he says: we can create clear path for migrate from old nb api to new async api through default implementation of async api.
But it may be freeze migration from old to new api.

@tustvold
Copy link

tustvold commented Dec 2, 2019

So the main problem I ran into when trying to do something similar in #149 was that as futures in rust are driven, the returned Poll<Result<u8, Self::Error> needs to borrow a reference to the underlying peripheral. Otherwise it would be possible to, for example, interleave writes to the same peripheral - which the current API prevents you from doing.

One can work around this by moving the peripheral itself into the future and only returning it on completion, but this makes for an incredibly obtuse API that makes simple constructs such as loops hard.

It is this precise issue with ergonomics that was one of the motivators behind adding native async support to Rust in the first place - see here.

It may be that someone smarter than myself can find solutions to these problems, but thought I'd clarify them here. Depending on the timelines for generalized associated types and generator resume arguments, it might be worth waiting for them to to bring native async support to traits and [no_std] respectively, before making breaking changes that would only need to be followed by more breaking changes.

@Nemo157
Copy link

Nemo157 commented Dec 2, 2019

The returned Poll shouldn't need to borrow the peripheral, it should be higher level APIs like fn read(&mut self) -> impl Future<Output = Result<u8, Self::Error>>; which borrow the peripheral in the returned Future in order to avoid interleaved reads/writes. The Poll returning methods are really low-level building blocks which don't need to provide full compile-time guarantees, the majority of users should be accessing them through the Future returning functions.

One of the nice things about the split between low-level traits of Poll functions and high-level utilities returning Future used in futures is that it avoids any need for GATs. And there are a lot of other issues with attempting to create core traits that return futures, see rust-lang/futures-rs#1365 for a lot of relevant discussion.

(Generator resume arguments will not affect any public API related to Future/async, it will only mean that async fn becomes usable on no_std, at least until some very far off day when generators themselves might be stabilised).

@therealprof
Copy link
Contributor

@Nemo157 Do you have an example for a Future executor which can run on embedded? Still trying to figure out our options; it would be kind of nice if we could spawn new futures anywhere and have the rest of the main loop be a poller for all futures. That would probably require a global future queue where interrupts and other functions can add their futures to?

@tustvold
Copy link

tustvold commented Dec 2, 2019

@Nemo157 You are completely correct, I had a brain fart and conflated Poll<Result<u8, Self::Error>> and impl Future<Output = Result<u8, Self::Error>>. GATs are only needed if trying to provide traits with methods that return futures.

The majority of users should be accessing them through the Future returning functions

So is the idea that driver crates would then provide methods returning futures? How would people compose and run these without the afformentioned async support for no_std? You could provide the block, await, etc... macros that the nb crate provides for its futures, but moving away from these was one of the motivators for the RFC?

Edit: I believe you would still need the macros nb provides to aid the writing of the driver crates, and then would additionally need to provide an executor implementation for people to run the futures and possibly further facilities to aid in composing them.

@Nemo157
Copy link

Nemo157 commented Dec 3, 2019

There is https://docs.rs/embedded-executor/0.3.1/embedded_executor/, I haven't used it myself, but it was developed as part of https://gitlab.com/polymer-kb/polymer.

How would people compose and run these without the afformentioned async support for no_std?

I think moving to core::future::Future only makes sense when intended to be used with async. The current implementation issue blocking it being used in no_std has workarounds (core-futures-tls using ELF thread-locals, embrio-async for a re-implemented transform, worst case an unsound copy of core-futures-tls that abandons thread-safety). And if there is consensus and experiments in moving some of the embedded ecosystem to async that might provide the incentive to get someone to work on fixing the compiler to support it.

@tustvold
Copy link

tustvold commented Dec 3, 2019

@Nemo157 So I'm a little bit confused as to what you are proposing exactly, as far as I can see it these are the following options

  1. This RFC gets accepted and things remain similar to how they are currently but we lose the utility block, await, etc... macros that are currently provided by nb
  2. This RFC gets extended to provide macros similar to those provided within the nb crate, but named in such a way as to not collide
  3. This RFC is blocked until async support comes to no_std allowing writing and interacting with methods returning core::future::Future. A utility macro could be provided to facilitate wrapping the embedded_hal methods returning poll types into core::future::Future types that can then be used with async
  4. The macros within nb are renamed to not collide with keywords
  5. The RFC is rejected

I think moving to core::future::Future only makes sense when intended to be used with async.

And I would argue moving to core::task::Poll only really makes sense if you're going to move to using core::future::Future.

@baloo
Copy link

baloo commented Dec 28, 2019

I got a working proof of concept of async/await on a stm32f429 mcu using #172 (comment) approach.
I used:

  • embrio-rs's Executor (slightly modified not to depend on rest of embrio)
  • made a simple fork of core-futures-tls to use a static mut to store the Context
  • used Switch async code to use futures crate #171 and ported stm32f4xx-hal to it.

Being not too knowledgeable about RFC process, should I share?

@therealprof
Copy link
Contributor

Being not too knowledgeable about RFC process, should I share?

Yes, please!

@baloo
Copy link

baloo commented Dec 28, 2019

baloo/pm-firmware@8ccdfd2

Some remarks:
I only implemented support for UART for now, because that's what I used to debug.
I used git submodules to ensure I you get the same versions of the libs I was running.
I reused the example from embrio-rs found here:
https://github.com/Nemo157/embrio-rs/blob/master/examples/apps/hello/src/lib.rs

@Matthias247
Copy link

I fully support here what @Nemo157 already mentioned. If APIs return Poll they should also provide abilities to store Waker objects and make it possible to use peripherals without busy looping. I actually think the whole Waker concept should also work decently well on embedded. Wakers can be implemented in flexible fashions, and certainly also in ones where interrupt handlers perform the waking. They should allow for non-busy waiting on bare-metal platforms as well as RTOS platforms if the HALs are properly implemented.

Regarding borrowing a peripheral: As @Nemo157 already mentioned, the Poll does not need to borrow it. But if the using the peripheral is modeled as an async fn which requires &mut self only one task can access it at the same time. Also if peripherals are implemented in a fashion where they can only store a single Waker, then using the peripheral from multiple places concurrently would either starve a task (by overwriting it's Waker) or lead to continuous swapping of Wakers back and forth. I doubt that's a huge problem in practice, since multiple tasks having ownership of peripherals is an anti-pattern anyway, and you might rather want to have a coordinator task in that case.

However for things like GPIO or UART you can also solve the multiple Waker problem in different fashion: Instead of using

pub trait Serial {
   /// Reads a single byte
    fn poll_read(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<u8, Self::Error>>;
}

where poll_read needs to store a Waker, we can delegate notifications to an async ManualResetEvent (e.g. the one in futures-intrusive), which can wake up an arbitrary amount of tasks using a fixed amount of memory. One of those objects would need to get embedded into the necessary peripheral. And the event is set from inside ISRs when the peripheral is ready. The API could then look like:

pub trait Serial {
    // Obtain a reference to the associated readiness event
    fn read_readiness_event(&self) -> &ManualResetEvent;
   /// Try to read bytes
    fn try_read(&self, mut buffer: &[u8]) -> Poll<usize>;
}

In that case the peripheral could be used along:

while !done {
    serial.read_readiness_event().wait().await;
    let received_bytes = serial.try_read(&buffer);
}

Multiple tasks can run that coode in parallel. Whether that's desirable or not is certainly a different question. But I think that one is best answered for each peripheral independently.

@dflemstr
Copy link

dflemstr commented Mar 13, 2020

Hey all, I just wanted to mention that I have been experimenting with this model here: https://github.com/dflemstr/embedded-platform/

It contains async traits for most of the things in embedded-hal, using a very tokio-esque future model. To get a taste, here is the i2c module: https://github.com/dflemstr/embedded-platform/blob/master/src/i2c.rs

I think we could just copy-paste from that library, I took care to make the traits very re-usable.


I also created https://github.com/dflemstr/direct-executor which is an executor that can run without alloc, however it doesn't support spawn and it uses interrupts/wfi heavily. Here's how it's used in the embedded-platform implementation of nrf52840: https://github.com/dflemstr/embedded-platform/blob/master/platforms/nrf52840/src/lib.rs#L42-L58

@ryankurte
Copy link
Contributor

hey folks, this (specifically _whether to switch from nb to core, how to implement async only insofar as it impacts this) is the major issue blocking us on a v1.0.0 release (#177), so i am wondering a) whether we can make a decision about this prior to actually building out the async e-h traits or are we blocked on this and b) what is the path to demonstrating / coming to consensus on this? and c) would it be useful for us to facilitate another video call to talk about this?

Related work:

  • @Disasm has demonstrated async traits using the unstable #![feature(generic_associated_types)] with default methods based on existing nb base traits here
  • @dflemstr has demonstrated async traits using core with a parallel set of traits (and no default methods) here with an implementation here

@chrysn
Copy link
Contributor

chrysn commented Feb 3, 2021

I've triggered an analogous discussion at rust-embedded-community/embedded-nal#47 and would like to update this discussion based on it (some was implied in posts but it may help to summarize):

  • The issues around thread locals and other std dependencies in futures themselves are gone by now.
  • For applications that want to preserve the nb style, a simple executor can do. In writing the demo for the abovementioned issue, I ended up with an executor very similar to @dflemstr's https://github.com/dflemstr/direct-executor/ (with differences on the level of "how to name variables" and "leave the loop using return or using break").
  • Many applications now using WouldBlock profit from not having to store state because they can idempotently call the same function again. This is neat but IMO also dangerous because it's easily overlooked that a later ? return WouldBlock can discard an earlier result's state.
    To my knowledge there's no satisfyingly simple solution, but maybe that's not a top priority issue either. (There is a workaround, but it is manually implementing a different state machine than we usually have the compiler generate).
  • The issue of when and where to store the wakers is one we should think through with a focus on how embedded-hal implementations can do it. So far, implementing a UART character receive can look at a register flag and nope out. If it is Future::poll()ed, it would suddenly need to not only store the waker, but also start messing with interrupts -- something embedded-hal implementations so far can get away without.
    Would it, in the interest of keeping simple things simple, be permissible for something that knows itself to take only finite time (eg. a UART transmit, or an ADC read) be OK for an implementation to not store the waker at all, but to call it right away in the poll hook and thus force the user into a spin loop? (It would then be a topic of implementation quality for the crate to decide whether to set up an interrupt, just call the waker, or even to know that the delay is so short and sure to complete that the poll function rather busy-loops for 10 CPU cycles than call the waker and take a lot more cycles for the executor to come around).
  • The waker is (at least with the current options of building one) constructed using dynamic dispatch. This may put us into a similar situation as we have with formatting where we know that there's only one and a lot of const-propagation could make things easier, but it's forced through a vtable and the compiler can't help.
    I don't expect we do anything about this now, but we should have it on screen as a possible source of regressions compared to the more minimalistic main loops nb allows us.

(I'm not touching on the topics of how to do things through traits or GAT here; that's described well in the last post.)

@chrysn
Copy link
Contributor

chrysn commented Feb 8, 2021

Before we commit on a road that works around GATs (and can not be trivially upgraded to use GATs; not sure whether that's the case with the current proposals), we may want to talk to wg-traits as there's been recent chat that they are making good progress [edit: around rust-lang/rust#44265 (comment)].

@lachlansneff
Copy link
Contributor

I've been messing around with this and GATs are pretty much fully usable for returning futures from trait methods at this point. All the ICEs seem to be worked out.

@eldruin
Copy link
Member

eldruin commented Jun 22, 2021

Now that we have a clear structure for the execution models (blocking/nb), in principle I would be open to add a nightly-only futures module behind a futures (or similar) feature.
I think this would not reintroduce the problem we had with unproven since futures would require nightly so I think users would not enable it lightheartedly.
As per this these would need to be in a "proven stable" state before we add them here.
However, since it builds only on nightly, theoretically it would already be subject to breakage at any time so we might either be a little more flexible around this to encourage adoption or keep the "can't touch this" status quo, especially considering that GAT will probably not hit stable in a long time.
Note that "being a bit more flexible" would probably imply some kind of SemVer guarantee waiving for the futures module until it can be built on Rust stable.

To my knowledge what comes closest to this state is embassy-traits. However, from what I see, embassy is still under heavy development.
cc: @Dirbaio, @thalesfragoso

@burrbull
Copy link
Member

#285

@Dirbaio
Copy link
Member

Dirbaio commented Jun 22, 2021

Would it, in the interest of keeping simple things simple, be permissible for something that knows itself to take only finite time (eg. a UART transmit, or an ADC read) be OK for an implementation to not store the waker at all, but to call it right away in the poll hook and thus force the user into a spin loop?

Yes, this is allowed by the Futures trait, although I'd argue that'd be a quite low-quality implementation. Unusable for low-power applications. It'd work okay, it's even "fair" as in all the other tasks get a turn to run before the self-waking task runs again in a multi-task executor.

Embassy does hook up the interrupts to wakers, although this means it needs a whole new infrastructure for drivers to "own" interrupts that hasn't been needed in current HALs so far.

The waker is (at least with the current options of building one) constructed using dynamic dispatch. This may put us into a similar situation as we have with formatting where we know that there's only one and a lot of const-propagation could make things easier, but it's forced through a vtable and the compiler can't help.

Embassy mitigates this by having a custom WakerRegistration type to store wakers. It has 2 impls, an embassy-only one and an executor-agnostic one (enabled by a Cargo feature). The executor-agnostic one stores the waker, the embassy-only one assumes all wakers are embassy wakers and only stores the task pointer (1 word instead of 2) and does direct calls (no vtable).

For users that want to use busyloop pseudo-executors there could even be a third "nop" implementation that stores nothing and does nothing on wake.

@Dirbaio
Copy link
Member

Dirbaio commented Mar 14, 2023

This can be closed now that embedded-hal-async exists.

@Dirbaio Dirbaio closed this as completed Mar 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests