Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for Async I/O migration #1065

Closed
11 of 15 tasks
jebrosen opened this issue Aug 2, 2019 · 64 comments
Closed
11 of 15 tasks

Tracking Issue for Async I/O migration #1065

jebrosen opened this issue Aug 2, 2019 · 64 comments

Comments

@jebrosen
Copy link
Collaborator

jebrosen commented Aug 2, 2019

Status

Rocket's master branch is currently fully asynchronous and supports async/await!

Pending

Done

Design

The async APIs are based on rust's async/await feature and std::future::Future, running on the tokio (0.2) runtime.

  • High-level APIs (routes and catchers):
    • Routes can now be defined as #[get|post|route(...)] async fn(params...) -> R where R is any Responder type. Routes that do not await anything can remain unchanged. #[catch]ers can also be defined as an async fn.
    • Methods on Data now return Futures that must be .awaited. As a consequence, any route that uses Data directly must be async. Routes that use data guards such as Json are not affected by this change.
    • Routes that make their own network requests or other I/O should be async, as it will allow other requests to continue while waiting for the operation.
  • Mid and low-level APIs (FromTransformedData and FromData, Fairing, Responder)
    • Several traits are now defined using async-trait, re-exported as #[rocket::async_trait].
    • Data is read asynchronously: DataStream implements tokio::io::AsyncRead instead of Read. Thus, FromTransformedData::transform and From(Transformed)Data::from_data are async fn or return Futures.
    • Response bodies are streamed to the client asynchronously: set_body and related methods expect tokio::io::AsyncRead instead of Read.
      • Consequently, Fairing::on_response and Responder::respond_to are now async fn. This allows Responders and Fairings to access and/or replace the body data.
  • The testing APIs have been moved to rocket::local::asynchronous and rocket::local::blocking. The blocking test API is simpler to use, and recommended when possible.
  • #[rocket::main], #[rocket::launch], and #[rocket::async_test] have been added to make async fn easily accessible for main() and test().
  • Many long-standing bugs with keep-alive, graceful shutdown (Clean shutdown? #180), performance, and more are fixed or will be easier to fix by using actively-supported versions of libraries.
  • Some new functionality will be possible to implement, such as SSE and WebSockets. These will likely not make it into Rocket 0.5. At this point, Rocket should not be missing any APIs necessary to implement SSE "by hand". WebSockets is a bit more complicated, and tracked in Native WebSocket support #90 .
@jhpratt
Copy link
Contributor

jhpratt commented Aug 2, 2019

The async_await feature is now required in addition to proc_macro_hygine and decl_macro

Wasn't decl_macro removed on the master branch a while back?

Currently, routes must either be fn -> R where R: Responder (non-async) or async fn -> R where R: Responder (async). Should we try to support (non-async) fn -> impl Future<Output=R> where R: Responder?

I don't see why not. Though it would be relatively trivial to create it async and await the returned value(s).

I'll take a look at uuid tomorrow to see if any changes are necessary. I suspect the answer is no.

@jebrosen Roughly how much of the documentation can be switched over to async? Are we going to primarily document in async, or will we essentially double everything up?

@jebrosen
Copy link
Collaborator Author

jebrosen commented Aug 2, 2019

Wasn't decl_macro removed on the master branch a while back?

Right! I'll fix that.

Currently, routes must either be fn -> R where R: Responder (non-async) or async fn -> R where R: Responder (async). Should we try to support (non-async) fn -> impl Future<Output=R> where R: Responder?

I don't see why not. Though it would be relatively trivial to create it async and await the returned value(s).

The issue is the return type of routes. Currently the codegen assumes that a non-async function returns some type R that implements Responder and an async function returns a Future whose output type is some R implementing Responder. Allowing functions to return either a Responder or a Future is hairy because both are traits and pretty soon we would end up wrestling with specialization. But I might have missed something in my analysis, and I would really like to support it if we can!

@jebrosen Roughly how much of the documentation can be switched over to async? Are we going to primarily document in async, or will we essentially double everything up?

The only place where there is a choice is routes; currently everything else is an async-only API and documentation should reflect that. I was going to wait a bit before pushing hard for docs, as there are likely a few design issues to be resolved with which APIs return which futures with what lifetime bounds.

My admittedly hand-wavy assumption was that we would explain traits like FromData and Responder in terms of async and futures, but use mostly non-async routes. We should definitely have a few examples of async routes, perhaps for HTTP requests or File I/O. For a lot of routes I do expect it to be as easy as changing fn to async fn and using .await in the right places.

Another really important point here is that manual FromData and Responder implementations should be relatively rare and are much more in the domain of API documentation than in the guide. I am personally more worried about the accessibility and understandability of the guide.

@jhpratt
Copy link
Contributor

jhpratt commented Aug 2, 2019

The issue is the return type of routes. Currently the codegen assumes that a non-async function returns some type R that implements Responder and an async function returns a Future whose output type is some R implementing Responder. Allowing functions to return either a Responder or a Future is hairy because both are traits and pretty soon we would end up wrestling with specialization. But I might have missed something in my analysis, and I would really like to support it if we can!

That certainly makes sense. Adding support for -> impl Future would be backwards compatible, as it doesn't currently work. So that's a point that can certainly be postponed to see if there's even demand after async lands.

The only place where there is a choice is routes; currently everything else is an async-only API and documentation should reflect that. I was going to wait a bit before pushing hard for docs, as there are likely a few design issues to be resolved with which APIs return which futures with what lifetime bounds.

Probably best to hold off on docs for the time being, then.

@Nemo157
Copy link

Nemo157 commented Aug 2, 2019

compression - Not for the faint of heart; there might be some important APIs that are not yet available. HELP WELCOME

Quick advertisement for https://github.com/rustasync/async-compression, this should allow just drop-in wrappers to transparently compress/decompress AsyncRead/AsyncWrite instances. Not all encodings are supported for the read/write interfaces yet as the first framework it was used in used the impl Stream<Item = io::Result<Bytes>> adaptors instead, but it should be straightforward to expand them (at least once I merge the initial AsyncWrite support).

@theduke
Copy link

theduke commented Aug 2, 2019

@jebrosen this is awesome!

Just as a side note, both tokio and hyper master are working with std::future::Future now. It's still a WIP but in my testing the essentials are all working, so this could already be based on the master branches instead of dealing with the compat layer.

@jhpratt
Copy link
Contributor

jhpratt commented Aug 2, 2019

contrib/lib/src/uuid.rs looks good to me as-is. It's just a thin wrapper around the uuid crate, so there's not really any room for making things async, unless we were to go all in and parse it asynchronously.

@alexsherman
Copy link

databases - We are not necessarily pushing for asynchronous database engines, but we should at least ensure they will be usable. postgres is a good candidate for exploring this.

Hi - what would you be looking for in a pull request to explore this? I've been developing some projects using Rocket and have been using both tokio-postgres and the synchronous rust-postgres, as well as their connection pools, and I would like to help if I think I know how.

@jebrosen
Copy link
Collaborator Author

jebrosen commented Aug 6, 2019

Hi - what would you be looking for in a pull request to explore this?

I have not tried too hard to test what "ensure they will be usable" entails. There are a few problems I can see ahead of time:

  • Futures-aware database engines probably need to be run within a specific runtime (i.e. tokio 0.1 or tokio 0.2). So once Rocket uses tokio 0.2 ((async) Use the futures 0.3 - compatible versions of hyper and tokio #1071), you wouldn't be able to use the current version of tokio-postgres. This is a non-obvious compatibility problem that isn't expressed in any type signature.
  • "Connect" might be an asynchronous operation. This seems to be the case for tokio-postgres, for example. So we should evaluate if making FromRequest return a future would be worth it for making databases (among other things) easier to work with.

At a minimum, I think we should be able to:

  • Create a connection pool with tokio-postgres and/or bb8 and put it in managed state. This should tell us if on_attach and on_launch need to be async-capable.
  • Create a request guard type wrapping a connection (or a future that will resolve to a connection). This and the previous bullet point together are more or less what the guide demonstrated for databases in Rocket 0.3 before we had rocket_contrib::databases.
  • Use said connection via the request guard within a route to query/alter the database. This should work fine, because routes can now be async fn.
  • Test, figure out, and document cross-runtime compatibility (or lack thereof). That is, a definitive statement in the form of "Rocket uses tokio 0.2, so your database futures have to as well" if that's the case. Alternatively, maybe we could find a proof of concept of running two runtimes - tokio 0.2 (for rocket) alongside tokio 0.1 (for tokio-postgres) with some kind of message passing between them. I leave this intentionally vague because it's an annoying solution to an annoying problem and someone other than I will probably think of a better idea.

@jebrosen
Copy link
Collaborator Author

jebrosen commented Aug 18, 2019

I am mostly waiting on #1071 (after a new tokio alpha and hyperium/hyper#1897) so that we don't need to significantly rewrite too much code that directly deals with the low-level bits -- which are pretty different between tokio 0.1 and 0.2. For example, I suspect #1070 will take almost as much work again as it would to implement it now.

At present all public APIs use async/await and futures 0.3, but everything runs on a tokio 0.1 runtime. This means that I/O types from tokio 0.2 cannot be used directly in routes yet, for example.


For anyone interested in kicking the tires pretty early on, I have written a short and possibly incomplete guide to trying projects against the async branch. At this stage the most helpful feedback would be examples of "things I can't do anymore" or that now require expensive allocations, especially in implementations of FromData or Fairing.

  • Depend on rocket = { git = "https://github.com/SergioBenitez/Rocket", branch = "async" } and rocket_contrib = { git = "https://github.com/SergioBenitez/Rocket", branch = "async" }
  • Add #![feature(async_await)] to your crate. You can remove #[feature(decl_macro)] thanks to Remove use of decl_macro by using a different hack. #964.
  • Typed headers are removed and a replacement has not been decided on ((async) Remove or replace typed headers #1067), so you will need to build headers manually and/or use raw headers at least for now
  • Implementations of FromData, Fairing, AdHoc::on_response, and Responder must be converted to use futures. For some cases, wrapping the function body in Box::pin(async move { }) will be enough. For some traits, just like with FromData for rocket_contrib::json::Json, you might need to copy some data out of the request before creating the future.
  • Some dependencies such as ring and handlebars have been updated, which may consequently break your application. You should get obvious build failures if this affects you.
  • You can now declare routes as async, and await futures inside of them.

@jhpratt
Copy link
Contributor

jhpratt commented Aug 18, 2019

@jebrosen have you taken a look at async-trait? I know I've mentioned it before, and it reduces some of the overhead associated with async trait fns, like lifetimes and boxing. As a bonus, all we'd have to do after GATs are (some day) implemented is remove the attribute.

@jebrosen
Copy link
Collaborator Author

I did try async-trait today and ran into some issues that might be able to be worked around. I would prefer to use async-trait on either all affected traits or none of them, and I'm hopeful that it will be usable.

@Nemo157
Copy link

Nemo157 commented Aug 19, 2019

As a bonus, all we'd have to do after GATs are (some day) implemented is remove the attribute.

Unfortunately that would be a breaking change to any public traits, as far as I know there's no way to be forward compatible with GATs in traits. You can be forward compatible with using 'static bound async blocks in traits (once type_alias_impl_trait is stabilized) by using an associated type and setting that to be the pinned-box in implementations, e.g. skimming the changes it looks like FromDataSimple could be defined like this (and maybe FromData since it uses an external lifetime), but Fairing can't as it borrows from its arguments.

@jhpratt
Copy link
Contributor

jhpratt commented Aug 19, 2019

Yeah, it would definitely be a breaking change, as APIs would be different. I was referring to the changes in the code itself, which would amount to deleting a few lines.

@jhpratt
Copy link
Contributor

jhpratt commented Aug 19, 2019

Update on the attempt to use async-trait everywhere: it's currently not possible. There is a bug in the compiler that prevents async default fn or default async fn, which is necessary where specialization is currently used.

@nicoburns
Copy link

Test, figure out, and document cross-runtime compatibility (or lack thereof). That is, a definitive statement in the form of "Rocket uses tokio 0.2, so your database futures have to as well" if that's the case. Alternatively, maybe we could find a proof of concept of running two runtimes - tokio 0.2 (for rocket) alongside tokio 0.1 (for tokio-postgres) with some kind of message passing between them. I leave this intentionally vague because it's an annoying solution to an annoying problem and someone other than I will probably think of a better idea.

tokio-postgres has a std-futures branch with an in-progress port to tokio 0.2 (and std::future). We could probably just specify that that version is needed for compatibility.

@Centril
Copy link

Centril commented Aug 21, 2019

Update on the attempt to use async-trait everywhere: it's currently not possible. There is a bug in the compiler that prevents async default fn or default async fn, which is necessary where specialization is currently used.

I fixed this bug on nightly. Please make sure however that your uses of default async? fn are #[cfg(...)] gated in a separate module once you decide to make rocket work on stable.

@jhpratt
Copy link
Contributor

jhpratt commented Aug 21, 2019

Thanks @Centril! I believe @jebrosen has some way to avoid needing specialization — I recall seeing it being eliminated in a PR or something like that.

Hopefully others can now use async-trait in more situations.

@jhpratt
Copy link
Contributor

jhpratt commented Aug 28, 2019

After attempting to convert r-spacex/Titan to the current HEAD of the async branch (dcea956), I ran into a glaring issue that needs to be addressed before even a prerelease. Currently, diesel (at least its PostgreSQL type) doesn't implement Sync, so it can't be used for a large swath of cases. @jebrosen has pointed out that it does implement Send, so it can be used in some cases.

While not directly related to Rocket, this is something we'll likely want to work with @sgrif et al. to ensure full compatibility.

@omac777
Copy link

omac777 commented Sep 3, 2019

I tried the async hello world it was working fine.
When I tried to convert it into a tls version so it resembles the tls example from the non-async rocket master, it was still serving up a http url. I confirmed this to be a true bug because if I just switch to the master branch within the Cargo.toml, it does enable tls and does serve up a https url.

[dependencies]
rocket = { git = "https://github.com/SergioBenitez/Rocket", features = ["tls"], branch = "async" }
# rocket = { git = "https://github.com/SergioBenitez/Rocket", features = ["tls"] }

THIS IS THE ASYNC BRANCH OUTPUT:
 $ ROCKET_ENV=dev ./target/release/helloasyncrocket 
🔧 Configured for development.
    => address: localhost
    => port: 8000
    => log: normal
    => workers: 24
    => secret key: generated
    => limits: forms = 32KiB
    => keep-alive: 5s
    => tls: enabled
🛰  Mounting /:
    => GET / (hello)
🚀 Rocket has launched from http://localhost:8000


I was expecting to see an output similar to the one from THE NON-ASYNC BRANCH OUTPUT behaving correctly:
$ ROCKET_ENV=dev ./target/release/helloasyncrocket 
🔧 Configured for development.
    => address: localhost
    => port: 8000
    => log: normal
    => workers: 24
    => secret key: generated
    => limits: forms = 32KiB
    => keep-alive: 5s
    => tls: enabled
🛰  Mounting /:
    => GET / (hello)
🚀 Rocket has launched from https://localhost:8000

@jebrosen
Copy link
Collaborator Author

jebrosen commented Sep 3, 2019

Yes, TLS is not implemented yet. I'll make that clearer in the main comment.

@Follpvosten
Copy link
Contributor

Follpvosten commented Oct 24, 2019

Not sure if this is the right place for such a question, but here we go:

I'm currently writing a backend for a web app with Rocket, using tokio-postgres directly at the moment (but this should apply to pools as well).
To remove boilerplate at the start of each handler which needs database access, I've implemented FromRequestAsync for my typed wrapper around the connection, so I can just fn handler(client: db::Client) -> ...
Since connecting to a database is an inherently expensive operation, this is an obvious use-case for request-local state; however, both establishing a connection with an async driver and retrieving one from an async pool are async operations, and local_cache currently doesn't have an async equivalent.

Would it be possible to implement such an async variant? I have no idea how much work would have to go into that, as far as I can tell, it probably wouldn't be as elegant as the current local_cache implementation, but it should be doable (taking a future instead of a closure and awaiting it once). If it's not too hard, maybe I could even contribute it.

EDIT, 10 minutes later:

pub async fn local_cache_async<T, F>(&self, fut: F) -> &T
    where F: std::future::Future<Output = T>,
            T: Send + Sync + 'static
{
    if let Some(s) = self.state.cache.try_get() {
        s
    } else {
        self.state.cache.set(fut.await);
        self.state.cache.get()
    }
}

I just want to say that I love async.

@jebrosen
Copy link
Collaborator Author

Why use local_cache for this at all instead of a connection pool? Do you need to use the same connection in many request guards and fairings while processing one route?

@Follpvosten
Copy link
Contributor

Follpvosten commented Oct 24, 2019

You're right, it's not a good example; I'm currently waiting for Rocket to implement async pooling support, that's why I use the connection directly.

However, as I said, this also applies to async pools, and if I want to implement something like FromRequestAsync for User and Admin with an async connection pool (to go with the example in the docs), I'll also want to query the database async. That should be the reason for providing an async local cache variant, not my specific example.

@jebrosen
Copy link
Collaborator Author

You can still implement FromRequestAsync for User with a connection pool, without the local cache (e.g. https://git.jebrosen.com/jeb/sirus/src/branch/async-db-pool-icky/sirus-server/src/guards.rs#L39). The local cache would save the (hopefully small!) expense of pulling from the pool multiple times, though, so it's still a good idea - I just don't see why it's necessary.

(side note: let's move this conversation to #1117)

@Razican
Copy link
Contributor

Razican commented Jan 22, 2020

Hi, I'm getting a bunch of Error: FATAL: sorry, too many clients already errors when running tests in the async branch. It seems that if I run a bunch of tests in parallel, I get the error.

@jebrosen
Copy link
Collaborator Author

@Razican Do you know what code that is ultimately coming from? As far as I can tell it's not from rocket or hyper.

@Razican
Copy link
Contributor

Razican commented Jan 26, 2020

@jebrosen solved. It turns out it was coming from the PostgreSQL database backend. It seems that doing all the tests in parallel, it was creating too many connections at the same time :)

@vultix
Copy link

vultix commented Mar 4, 2020

@jebrosen whats the status on getting the async branch merged? Are there any issues we can help with to push it over the finish line?

@jebrosen
Copy link
Collaborator Author

jebrosen commented Mar 4, 2020

Since my last update comment above, these were also done:

  • async_trait - it works for enough traits we went ahead and used it
  • Switching to AsyncSeek for bodies
  • Miscellaneous cleanups and fixes to code and documentation

I am currently working on the following, in #1242 (and forgot to link it back here):

  • Asynchronous on_attach fairings. This would require some pretty big API changes.

My next PR after that will likely be a wrapper making it safe(r) to use the currently supported blocking databases.

I will look into any async-specific issues that could use assistance. One that comes to mind now is typed headers - e.g. hyperium/headers#48 (comment), or an investigation into which typed header crate(s) would be suitable for built-in support in Rocket would be helpful.

@jebrosen
Copy link
Collaborator Author

I have posted a shortup writeup on what kind of investigation needs to be done on typed headers, if anyone is interested in helping with that particular effort - #1067 (comment)

@Songtronix
Copy link

Has there been any work regarding logging? I think tracing would fit the bill quite nicely as outlined in #21
Btw very excited for 0.5, awesome work!

@Ameobea
Copy link

Ameobea commented Sep 14, 2020

Hi all,

I've switched over fully to the latest Rocket with full async for a couple of my latest projects, but was running into the fact that compression support doesn't exist yet. This was inflating responses to my users and bringing up the cost of my bill for network egress, so I hacked up a placeholder solution that adds back gzip support: Ameobea@eab53cf

The API is exactly the same as before; you just add features = ["compression"] for rocket-contrib and add the Compression::fairing(). The implementation is not optimal in any way; the compression itself isn't async but it really shouldn't be a big deal for almost all workloads and the huge benefit of having gzip compression outweighs any increase in response time by a huge margin.

I figured this may be useful to other people who wanted to gain the benefits of compression and the latest async Rocket without having to re-write any of their code. Once someone gets around to doing the job properly, it should just be a change to Cargo.toml to fix it.

EDIT 2021-07-17:

I've created a library that performs response compression using the async-compression library under the hood:

https://github.com/Ameobea/rocket_async_compression

@mimoo
Copy link

mimoo commented Feb 10, 2021

Note that I'm getting some issues when using a managed state

async fn dependencies(state: State<App>, repo: String) -> String { }
error[E0726]: implicit elided lifetime not allowed here
  --> src/main.rs:43:30
   |
43 | async fn dependencies(state: State<App>, repo: String) -> String {
   |                              ^^^^^^^^^^ help: indicate the anonymous lifetime: `State<'_, App>`

error[E0621]: explicit lifetime required in the type of `__req`
  --> src/main.rs:43:23
   |
42 | #[get("/dependencies/<repo>")]
   | ------------------------------ help: add explicit lifetime `'static` to the type of `__req`: `&'_b rocket::Request<'static>`
43 | async fn dependencies(state: State<App>, repo: String) -> String {
   |                       ^^^^^ lifetime `'static` required

@tforgione
Copy link

tforgione commented Feb 10, 2021

@mimoo from what I've seen, when using async routes, you have to specify the anonymous lifetime like so

async fn dependencies(state: State<App, '_>, repo: String) -> String { String::new() }

@SergioBenitez
Copy link
Member

@mimoo from what I've seen, when using async routes, you have to specify the anonymous lifetime like so

Note that this is not a Rocket restriction, but rather, a Rust restriction. Also, it's State<'_, T>.

@Type1J
Copy link

Type1J commented Mar 31, 2021

I'm guessing that Responder is still a sync trait in the latest async code. I was hoping to make a request with reqwest in respond_to().

@SergioBenitez
Copy link
Member

I'm guessing that Responder is still a sync trait in the latest async code. I was hoping to make a request with reqwest in respond_to().

This was a conscious decision, not something that's left to do. In fact, Responder was first made async and then reverted to be a regular trait. The rationale is that, with the exception of the body itself being an async stream, anything async can happen in the handler before the Responder is returned.

Specific to your comment, sending HTTP requests in the responder sounds rather odd. A responder generates a response, while this sounds like more processing, which should occur in the handler.

@Type1J
Copy link

Type1J commented Mar 31, 2021

@SergioBenitez You're correct, that it's not the place for it. I was using okapi for OpenAPI, and part of the API is for an upstream server, I have stub handlers with dummy data for now, but the idea is that I want to send all headers and all of the body to the upstream server, then respond with exactly what it said. What the upstream server said will likely be what the return type of the handler was, but I didn't want to deserialize just to reserialize, so I made a "wrapper" responder that implemented the traits for Rocket and okapi, but it generated the response completely. However, when I added in reqwest, I got a panic, and I also found that I didn't have access to the body, so I switched to using a FromData struct that I'll pass in to that responder, so all of the respond_to() logic is sync. I know that it's not ideal, but I need these routes to be fast.

@Type1J
Copy link

Type1J commented Mar 31, 2021

Is there a better way to do forwarding?

@SergioBenitez
Copy link
Member

It sounds like you should be implementing a custom Handler.

@Type1J
Copy link

Type1J commented Mar 31, 2021

@SergioBenitez Thanks! Great framework, by the way!

@DuBistKomisch
Copy link

@Type1J I had this exact issue recently, with some messing around you can get a compatible async stream from the reqwest response body to store in a struct implementing Responder: https://github.com/DuBistKomisch/jakebarnes/blob/ba7171c013bd101e049d8b2af5fc7120b6ea6c1c/src/steam.rs#L93-L99

@Ameobea
Copy link

Ameobea commented Jul 18, 2021

I've implemented async response compression using the async-compression library and created an external library for it:

https://github.com/Ameobea/rocket_async_compression

It currently only works with the git version of Rocket. The internal implementation is extremely simple and I think it's pretty much good to go for including in Rocket itself. Let me know if anyone is interested in making that work.

@mainrs
Copy link

mainrs commented Jul 18, 2021

@Ameobea might be worth making the compression level customizable via the Rocket.toml file once integrated into Rocket.

@SergioBenitez
Copy link
Member

With the new rocket_db_pools, I'm happy to close this issue. The remaining issues tracked in the parent post are not specific to async I/O (typed headers, more docs, and connection APIs) and have independent issues or have no viable solutions outside of rustc (lints for async safety).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests