[Discussion] Make integrations with asyncio projects more friendly #10

Open
standy66 opened this issue Jan 25, 2019 · 7 comments

@standy66
Member

Right now purerpc uses the curio event loop, which is really pleasant to use (thx @dabeaz) but limits interoperability with asyncio projects. Curio was chosen in 2017 mainly because of this post by @njsmith. As of 2019, some things have changed:

  1. Python 3.7 brought a decent upgrade to asyncio, and some problems mentioned in the blog post were fixed: e.g. StreamWriter now has a wait_closed() method, there is now an asyncio.get_running_loop() function, and there is other new stuff. One could argue that, as asyncio evolves and becomes more async/await-native, migrating to it becomes possible, especially given that almost all event loop logic in purerpc is abstracted away in grpc_socket.py. This is one of the paths forward, and we wouldn't have to think about asyncio interoperability anymore, but there are caveats. asyncio still lacks curio's wonderful TaskGroups and its thread and process pools.
  2. curio.bridge.AsyncioLoop may be used to bridge together two worlds: asyncio's and curio's. I don't know whether or not there are performance implications and caveats with this approach, but we get to keep awesome curio functionality. We used this bridge internally with aiopg and it worked fine (cc @penguin138). But we need to make this bridge transparent to the user (maybe some tornado docs may help), so curio and asyncio work together in handlers:
async def some_purerpc_handler(requests):
  async for request in requests:
    await asyncio.sleep(1)  # any asyncio-native awaitable
    await aiohttp_fetch(request.url)
  3. We can look at Nathaniel's trio library, which kinda improves on what David has built in curio. There is also a trio-asyncio bridge, but I've never heard of anyone using it.
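
As a rough illustration of the two Python 3.7 additions mentioned in item 1, here is a minimal standard-library sketch. The names `echo_once` and `roundtrip`, the loopback address and the one-line echo protocol are made up for the example; none of this is purerpc code.

```python
import asyncio


async def echo_once(reader, writer):
    # Echo a single line back to the client, then close cleanly.
    data = await reader.readline()
    writer.write(data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()  # StreamWriter.wait_closed(), new in Python 3.7


async def roundtrip(message: bytes) -> bytes:
    # get_running_loop(), new in Python 3.7, only works inside a running loop.
    loop = asyncio.get_running_loop()
    assert loop.is_running()

    # Port 0 lets the OS pick a free port (an assumption for this demo).
    server = await asyncio.start_server(echo_once, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(message)
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()

    server.close()
    return reply


if __name__ == "__main__":
    print(asyncio.run(roundtrip(b"ping\n")))
```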
@dabeaz

dabeaz commented Jan 25, 2019

Let me throw out an idea I've been toying around with in my mind.... One of the big ideas in Curio was that of keeping the underlying kernel pretty well isolated from everything else. I did some experiments with writing a kernel purely in C (which worked). It might not be too hard to write a Curio kernel that was implemented entirely on top of asyncio. Maybe it's possible to have some kind of better asyncio interop through that kind of approach. The idea is that you would have the same high-level Curio programming API that it has now--it'd just be running natively on top of asyncio. Thoughts?
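
To make the shape of that idea concrete, here is a toy sketch of a trap-style coroutine API whose "kernel" is just an asyncio coroutine. This is not Curio's actual kernel and these are not its trap names; `_trap`, `sleep`, `drive` and `job` are hypothetical names invented for the illustration, and only a single trap is supported.

```python
import asyncio
import types


@types.coroutine
def _trap(name, *args):
    # Hand a request to whoever is driving this coroutine and
    # return whatever the driver sends back.
    return (yield (name, args))


async def sleep(seconds):
    # High-level API: user code never touches asyncio directly.
    return await _trap("sleep", seconds)


async def drive(coro):
    """Toy kernel: service each trap by delegating to asyncio."""
    result = None
    while True:
        try:
            name, args = coro.send(result)
        except StopIteration as stop:
            return stop.value  # the coroutine's return value
        if name == "sleep":
            result = await asyncio.sleep(*args)
        else:
            raise RuntimeError(f"unknown trap: {name}")


async def job():
    await sleep(0.01)
    return "done"


if __name__ == "__main__":
    print(asyncio.run(drive(job())))
```

Because `drive` is itself an ordinary asyncio coroutine, code written against the trap-style API would share one loop and one thread with native asyncio code in this arrangement.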

@standy66
Member Author

standy66 commented Jan 25, 2019

Wouldn't we end up with an async/await-native API layered on top of the plain old callback API? Surely it's all just jumps and system calls from the assembler's perspective, and the OS doesn't care anyway, but I wonder whether the old API has hidden buffers (e.g. for the sake of a non-blocking transport.write) that the new API no longer needs.

@njsmith
Member

njsmith commented Jan 26, 2019

Hey, I hadn't seen this before but it's a neat project! Thanks for CC'ing me :-).

@dabeaz

> It might not be too hard to write a Curio kernel that was implemented entirely on top of asyncio

You'll probably have some issues with curio.socket, which exposes a way richer API than asyncio does... This is part of why trio-asyncio implements asyncio on top of trio, instead of vice versa. In particular, the standard asyncio loop on Windows doesn't even have wait_writable/wait_readable.

@standy66

> curio.bridge.AsyncioLoop may be used to bridge together two worlds: asyncio's and curio's. I don't know whether or not there are performance implications and caveats with this approach, but we get to keep awesome curio functionality

One caveat to watch out for is that with curio.bridge.AsyncioLoop the curio and asyncio worlds run in actually different threads, so when you have shared state you have to think in terms of thread-programming and possible race conditions everywhere, instead of async-programming and race conditions only at awaits. This is the main reason why it's useful to get both worlds onto the same underlying loop, whether that's curio-on-top-of-asyncio or asyncio-on-top-of-trio or whatever.
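
The threading caveat can be seen with nothing but the standard library. The sketch below is not curio's bridge; `AsyncioThread` and `which_thread` are hypothetical names for a helper of roughly the same shape, where an asyncio loop lives in a background thread and the caller submits coroutines to it.

```python
import asyncio
import threading


class AsyncioThread:
    """Hypothetical helper: an asyncio loop running in a background thread."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro):
        # Schedule the coroutine on the loop thread and block until it finishes.
        return asyncio.run_coroutine_threadsafe(coro, self.loop).result()

    def stop(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()


async def which_thread():
    await asyncio.sleep(0)
    return threading.get_ident()


if __name__ == "__main__":
    bridge = AsyncioThread()
    loop_thread = bridge.run(which_thread())
    bridge.stop()
    # The coroutine ran in a different thread than the caller.
    print(loop_thread != threading.get_ident())
```

Since the coroutine body executes on another thread, any state it shares with the caller needs thread-level synchronisation, which is exactly the concern raised above.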

> But we need to make this bridge transparent to the user

We've tried to figure out how to do this in trio-asyncio, but it's really hard :-(. The two worlds are pretty different, especially in their cancellation semantics. Also a lot of libraries (like aiohttp) are really fond of doing things like asyncio.current_task().cancel() which is like... totally meaningless in trio, and maybe not quite meaningless in curio, but also not really how curio does things IIRC.
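
For readers unfamiliar with the asyncio pattern mentioned here, this is a small standard-library sketch of what a self-cancel does in asyncio: `cancel()` only requests cancellation, and the `CancelledError` is delivered at the task's next suspension point. The function names and the `steps` list are invented for the illustration.

```python
import asyncio


async def self_cancelling(steps):
    asyncio.current_task().cancel()  # only *requests* cancellation
    steps.append("after cancel()")   # still runs: nothing has been raised yet
    try:
        await asyncio.sleep(0)       # CancelledError is delivered here
    except asyncio.CancelledError:
        steps.append("cancelled at await")
        raise


async def main():
    steps = []
    task = asyncio.ensure_future(self_cancelling(steps))
    try:
        await task
    except asyncio.CancelledError:
        steps.append("task cancelled")
    return steps


if __name__ == "__main__":
    print(asyncio.run(main()))
```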

> We can look at Nathaniel's trio library which kinda improves on what David has built in curio

👋 If you're interested then we'd certainly be happy to help you figure that out (and I think there are probably several people who would be excited about a grpc implementation for trio!). If you have questions then we have a pretty active chat channel.

Alternatively: You might also want to look at @agronholm's anyio, which provides a mostly-trio-ish API on top of asyncio/curio/trio. There are two major things to be aware of I think, but they probably won't bother you:

  1. like curio (but unlike asyncio/trio), anyio uses await for all operations, even ones that don't wait for anything; I find this awkward and confusing (esp. because it gets hard to tell which operations are atomic and which are not), but this may be my own idiosyncrasy, and if you're coming from curio you won't notice any difference :-).

  2. anyio ignores the high-level networking APIs that asyncio and trio provide, in favor of exposing the socket API directly. This means you're stuck dealing with some of the tricky portability issues that these libs would normally handle for you, but curio emphasizes the plain BSD socket API anyway, so this shouldn't be a surprise for you :-). (In particular, anyio has some problems with Windows support because of this.)

Also, using anyio does risk that you might be a bit foreign-feeling everywhere; e.g. Trio has a standard interface for streaming messages, and Trio users will probably expect a grpc API to use this for streaming RPCs? But this is probably not the biggest issue on your plate right now...
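
The atomicity point in item 1 above can be sketched in plain asyncio (not anyio): a read-modify-write with a suspension point in the middle can lose updates, while the same update without an await cannot be interleaved. All names here are invented for the example.

```python
import asyncio


async def unsafe_increment(state):
    current = state["counter"]
    await asyncio.sleep(0)  # suspension point: other tasks may run here
    state["counter"] = current + 1


async def atomic_increment(state):
    # No await between the read and the write, so no other task can interleave.
    state["counter"] += 1


async def run_many(worker, n=10):
    state = {"counter": 0}
    await asyncio.gather(*(worker(state) for _ in range(n)))
    return state["counter"]


if __name__ == "__main__":
    print("atomic:", asyncio.run(run_many(atomic_increment)))
    print("unsafe:", asyncio.run(run_many(unsafe_increment)))
```

When every operation is spelled with `await`, the reader can no longer tell from the syntax alone which of these two situations applies.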

@standy66
Member Author

standy66 commented Feb 1, 2019

@njsmith thanks for the thorough reply!

I've decided to try @agronholm's anyio library and, to my surprise, the migration went almost seamlessly. My work is in the anyio branch and I'll merge #13 once I get CI to run tests for all backends automatically (they can be run manually and already pass, though). So I guess you can add PureRPC to the list of libraries that support Trio (and curio and asyncio, of course)!

P.S. I've encountered a bug in anyio's sendall implementation that causes data corruption. The workaround right now is to monkeypatch it to match Dave's implementation in curio. But anyone feel free to send it upstream.
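
The thread does not say what the bug was, so the following is only a generic sketch of the invariant a sendall has to maintain: a single send may accept fewer bytes than offered, and the loop must resume from exactly where the last send stopped. It is neither anyio's nor curio's code; `ChunkySocket` is a hypothetical stand-in for a socket that does short writes.

```python
import asyncio


class ChunkySocket:
    """Hypothetical socket stand-in that accepts at most `limit` bytes per send."""

    def __init__(self, limit):
        self.limit = limit
        self.received = bytearray()

    async def send(self, data):
        await asyncio.sleep(0)  # pretend to wait for writability
        chunk = bytes(data[: self.limit])
        self.received += chunk
        return len(chunk)  # may be less than len(data)


async def sendall(sock, data):
    view = memoryview(data)
    while view:
        sent = await sock.send(view)
        view = view[sent:]  # drop only the bytes that were actually sent


if __name__ == "__main__":
    sock = ChunkySocket(limit=3)
    asyncio.run(sendall(sock, b"hello world"))
    print(bytes(sock.received))
```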

@agronholm

@standy66 I only read this after I fixed the bug. If you are in a hurry for a new release, I can do that. Otherwise I'll first have a go at fixing agronholm/anyio#37.

@standy66
Member Author

standy66 commented Feb 2, 2019

@agronholm Nah, I think I'll be fine if the current workaround stays for some time. After all, I am already monkeypatching some stuff in the h2 state machine so that it behaves correctly when communicating with gRPC Core peers.

@standy66
Member Author

standy66 commented Feb 3, 2019

Merged #13
Going to leave this open for now for further discussion and proposals

@standy66 standy66 removed their assignment Feb 3, 2019