Efficient mapping to Unix-style APIs with pollable readers and writers #1265

Open
reillyeon opened this issue Apr 5, 2023 · 2 comments

@reillyeon

When cross-compiling code written for native platforms to WebAssembly, developers need to implement the poll, read and write system calls on top of a ReadableStream or WritableStream. If we make the simplifying assumption that only file descriptors opened in non-blocking mode need to be supported, then we need two things: a way to register a callback for when a stream becomes readable or writable, and a way to issue a read or write that does only as much work as can be completed synchronously.

While this behavior can mostly be polyfilled (for example, by calling read() when a stream is polled and marking it readable when the read completes), I have not found a way to preserve the copy-minimizing benefits of BYOB readers: there is no way to guarantee that data is written to the BYOB buffer synchronously, so an intermediate buffer is required.
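
For illustration, the polyfill described above might look roughly like the following sketch. The names PolledReader, ReaderLike and readNonBlocking, and the EAGAIN value, are invented here; only read() on a default reader is an existing API. The staged chunk is the intermediate buffer, and dst.set() is the extra copy that a BYOB reader was meant to avoid.

```typescript
// Minimal reader shape; an object returned by stream.getReader() fits it.
interface ReaderLike {
  read(): Promise<{ done: boolean; value?: Uint8Array }>;
}

const EAGAIN = -1; // illustrative "would block" code, not a real errno value

class PolledReader {
  private reader: ReaderLike;
  private staged: Uint8Array | null = null; // intermediate buffer
  private inFlight: Promise<void> | null = null;
  private eof = false;

  constructor(reader: ReaderLike) {
    this.reader = reader;
  }

  // poll(): resolves after one read() settles. Empty chunks are dropped,
  // so EAGAIN right after poll() should be treated as a spurious wakeup.
  poll(): Promise<void> {
    if (this.staged !== null || this.eof) return Promise.resolve();
    let pending = this.inFlight;
    if (pending === null) {
      pending = this.reader.read().then(({ done, value }) => {
        this.inFlight = null;
        if (done) this.eof = true;
        else if (value !== undefined && value.length > 0) this.staged = value;
      });
      this.inFlight = pending;
    }
    return pending;
  }

  // read(): never waits. Copies from the staged chunk into dst and
  // returns the byte count, 0 at end of stream, or EAGAIN if nothing is staged.
  readNonBlocking(dst: Uint8Array): number {
    if (this.staged === null) return this.eof ? 0 : EAGAIN;
    const n = Math.min(dst.length, this.staged.length);
    dst.set(this.staged.subarray(0, n)); // the extra copy
    this.staged = n < this.staged.length ? this.staged.subarray(n) : null;
    return n;
  }
}
```

A caller would construct it with `new PolledReader(stream.getReader())`, await `poll()` from its event loop, and then call `readNonBlocking()` from the synchronous syscall shim.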

This could be resolved by making two additions to the ReadableStreamBYOBReader interface (with similar changes for writable streams):

  • a new peek() method which returns a Promise that resolves when there is data in the internal queue or the underlying source has indicated that there is data available.
  • a sync: true option to read() which will cause it to return a ReadableStreamReadResult rather than a Promise<ReadableStreamReadResult> if there was data in the internal queue or the underlying source responded to pull() by enqueuing a buffer synchronously. Otherwise it would return null or some other sentinel value.

Synchronous reads probably have to be signaled to the underlying source as well because changing the value of byobRequest midway through a read() operation would likely cause problems.
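
To make the proposal concrete, here is a rough sketch of how a read syscall shim could sit on top of it. Everything in it is hypothetical: ProposedByobReader models the peek() and read(view, { sync: true }) shapes suggested above, which do not exist in the Streams Standard, and FakeReader is only an in-memory stand-in so the sketch can run. The point is that the view aliases the caller's memory, so no intermediate buffer is involved.

```typescript
// Hypothetical shapes modelled on the proposal; not part of any standard.
interface SyncReadResult {
  done: boolean;
  value?: Uint8Array;
}
interface ProposedByobReader {
  peek(): Promise<void>;
  read(view: Uint8Array, options: { sync: true }): SyncReadResult | null;
}

const EAGAIN = -1; // illustrative "would block" code

// read(2)-style shim: fills memory[ptr, ptr + len) directly.
// Real BYOB reads transfer the view's buffer; this sketch glosses over that.
function fdRead(reader: ProposedByobReader, memory: ArrayBuffer, ptr: number, len: number): number {
  const view = new Uint8Array(memory, ptr, len);
  const result = reader.read(view, { sync: true });
  if (result === null) return EAGAIN; // sentinel: nothing available synchronously
  if (result.done) return 0;
  return result.value ? result.value.byteLength : 0;
}

// In-memory stand-in for a stream that supports the proposed calls.
class FakeReader implements ProposedByobReader {
  private queue: number[] = [];
  private waiters: Array<() => void> = [];

  push(bytes: number[]): void {
    this.queue.push(...bytes);
    this.waiters.splice(0).forEach((wake) => wake());
  }

  peek(): Promise<void> {
    if (this.queue.length > 0) return Promise.resolve();
    return new Promise<void>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  read(view: Uint8Array, _options: { sync: true }): SyncReadResult | null {
    if (this.queue.length === 0) return null;
    const n = Math.min(view.length, this.queue.length);
    view.set(this.queue.splice(0, n)); // written straight into the caller's view
    return { done: false, value: view.subarray(0, n) };
  }
}
```

A poll shim would await `reader.peek()` and mark the descriptor readable once it resolves; `fdRead()` then never has to wait.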

@ricea
Collaborator

ricea commented Apr 5, 2023

I would hope that native code would use an abstraction layer on top of poll, read and write which would make it easier to adjust the semantics. That's what Chromium's network code does.

Maybe we can just say "if you don't use an abstraction layer, you're going to get extra copies and you'll just have to accept it"?

> • a new peek() method which returns a Promise that resolves when there is data in the internal queue or the underlying source has indicated that there is data available.

Sketch: we add an optional peek() method to the underlying source. It is called when reader.peek() is called with nothing in the internal queue. The underlying source returns a promise which resolves when a synchronous read is expected to succeed. If the underlying source does not implement peek() then the streams machinery falls back to calling pull().
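
The fallback logic in this sketch could be written roughly as below. SketchSource and waitUntilReadable are illustrative names, the queue is reduced to a byte count, and pull() is simplified to take no controller argument; this is not how the Streams Standard specifies anything today.

```typescript
// Hypothetical underlying-source shape: pull() mirrors the existing hook
// (minus its controller argument), peek() is the proposed optional one.
interface SketchSource {
  pull(): void | Promise<void>;
  peek?(): Promise<void>;
}

// Settles when a synchronous read is expected to succeed.
async function waitUntilReadable(source: SketchSource, queuedBytes: number): Promise<void> {
  if (queuedBytes > 0) return; // data is already in the internal queue
  if (source.peek) {
    await source.peek(); // the source knows when data will be ready
    return;
  }
  await source.pull(); // no peek(): fall back to an ordinary pull
}
```

With this shape, sources that never implement peek() keep working unchanged, at the cost of a pull() that may enqueue a chunk before anyone asked to read it.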

Bikeshed: "peek" usually implies you get to look at the data before reading it. That would be a lot of trouble. Maybe instead have a getter called pending or something, so you could do await reader.pending?

> • a sync: true option to read() which will cause it to return a ReadableStreamReadResult rather than a Promise if there was data in the internal queue or the underlying source responded to pull() by enqueuing a buffer synchronously. Otherwise it would return null or some other sentinel value.

If we use the same method name, then we have to set the return type to any, which means we don't get the nice WebIDL magic for handling promises.

I would prefer to have a distinct method which does this, maybe readSync() or readNow(). It should probably throw a specific exception if there isn't any data available synchronously.
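
As a rough illustration of the distinct-method option: SketchReader, WouldBlockError, tryRead and the EAGAIN value below are all invented for this sketch, and readNow() is only one of the candidate names. With a separate method the return type stays a plain result, and "no data yet" becomes an exception that a syscall shim can translate.

```typescript
interface ReadResult {
  done: boolean;
  value?: Uint8Array;
}

// Hypothetical exception for "nothing is available synchronously".
class WouldBlockError extends Error {
  constructor() {
    super("no data available synchronously");
    this.name = "WouldBlockError";
  }
}

// Toy reader with an in-memory queue, standing in for a real stream reader.
// A single read({ sync }) method would instead have to return
// Promise<ReadResult> | ReadResult | null, which every caller must narrow.
class SketchReader {
  private queue: Uint8Array[] = [];

  enqueue(chunk: Uint8Array): void {
    this.queue.push(chunk);
  }

  // Distinct synchronous method: precise return type, throws when empty.
  readNow(): ReadResult {
    const chunk = this.queue.shift();
    if (chunk === undefined) throw new WouldBlockError();
    return { done: false, value: chunk };
  }
}

const EAGAIN = -1; // illustrative "would block" code

// Shim side: map the exception onto a non-blocking read result.
function tryRead(reader: SketchReader): number {
  try {
    const { value } = reader.readNow();
    return value ? value.byteLength : 0;
  } catch (err) {
    // Check by name, the way DOMException-style errors are usually matched.
    if (err instanceof Error && err.name === "WouldBlockError") return EAGAIN;
    throw err;
  }
}
```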

@reillyeon
Author

> I would hope that native code would use an abstraction layer on top of poll, read and write which would make it easier to adjust the semantics. That's what Chromium's network code does.
>
> Maybe we can just say "if you don't use an abstraction layer, you're going to get extra copies and you'll just have to accept it"?

To me that doesn't seem realistic: it means applications will need to write a custom I/O backend for their Wasm port, since the native POSIX APIs don't work this way and Emscripten mimics a POSIX platform.

We should probably reach out to the WASI folks to understand their plans for defining I/O system calls in WASI 2.0.
