[RFC] Add async/await proposal #406

fabianfett · 2021-08-16T08:35:33Z

In this PR we outline our vision for an async/await API. We would love to hear your thoughts.

fabianfett · 2021-08-16T08:36:42Z

cc @toddvarland @ktoso @tomerd @tachyonics @0xTim @gwynne @adam-fowler

fabianfett · 2021-08-16T08:38:23Z

adam-fowler · 2021-08-16T09:29:12Z

In general, I'm quite happy with this. I like that you use AsyncSequence as the stream primitive and that you don't plan to wrap it in your own AHC type. This gives users much more flexibility in how they provide data to AHC. I had planned to do the same for Soto. It makes my life easier now that AHC is doing the heavy lifting of reading from AsyncSequence.

The only thing I think is missing is a helper function on AsyncResponse that collates all the response body buffers into one. I know this is only a couple of lines of code, but it would be such a common piece of functionality that I think it is required.
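A minimal sketch of what such a helper could look like, under stated assumptions: `[UInt8]` stands in for NIO's `ByteBuffer`, and `collect(upTo:)` and `CollectError` are hypothetical names invented for this example, not AsyncHTTPClient API.

```swift
// Hypothetical helper, not an AsyncHTTPClient API: drains any async sequence
// of byte chunks into one array. `[UInt8]` stands in for NIO's ByteBuffer.
enum CollectError: Error, Equatable {
    case tooLarge(limit: Int)
}

extension AsyncSequence where Element == [UInt8] {
    /// Collects every chunk into a single buffer and throws once the total
    /// size would exceed `maxBytes`, so a huge body cannot exhaust memory.
    func collect(upTo maxBytes: Int) async throws -> [UInt8] {
        var collected: [UInt8] = []
        for try await chunk in self {
            guard collected.count + chunk.count <= maxBytes else {
                throw CollectError.tooLarge(limit: maxBytes)
            }
            collected.append(contentsOf: chunk)
        }
        return collected
    }
}
```

The size limit is a design choice of this sketch: a collate helper without an upper bound makes it easy to buffer an unbounded response by accident.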

There are still questions to be asked about task locals vs passing the Logger about and also the tier system but I guess this document isn't about those.

dnadoba · 2021-08-16T10:19:36Z

docs/async-await.md

+
+  static func stream<S: AsyncSequence>(_ sequence: S) -> Body where S.Element == ByteBuffer
+
+  static func stream<S: AsyncSequence>(_ sequence: S) -> Body where S.Element == UInt8


How do you imagine storing the sequence in the internal enum? I think we would probably need to erase it with something like AnyAsyncSequence, similar to what AnySequence does for Sequence. But as discussed in this thread, there is no AnyAsyncSequence in the standard library and AnySequence is already slow. AnyAsyncSequence would probably be even slower. I therefore think we can't get reasonable performance in the case where the element is of type UInt8, because of the overhead introduced by erasing the type of the sequence. The only way I can currently think of to make this fast would require making AsyncRequest generic over the actual Body type, which has its own downsides.

The way we make it fast in case of AsyncSequence where Element == UInt8 is by having an inlinable transformer from UInt8 to buffer. That is, a wrapper sequence that builds up buffers of, say, 1kB in size (or whatever) before it passes it on. We then wrap that sequence in the body.

So long as the transformer is inlinable we can have it specialized by the caller. We can then wrap it in our type erasing logic (simply a callback of type () async throws -> ByteBuffer?) and voila, we get pretty good straight line performance. It's not perfect, but it should be pretty good.

A proof of concept has been implemented here: https://github.com/fabianfett/async-http-client/blob/ff-async/Sources/AsyncHTTPClient/AsyncAwait/AsyncRequest.swift#L74-L87

In the static factory methods we translate the AsyncSequence into a callback, that makes use of the specialization.

static func stream<S: AsyncSequence>(_ sequence: S) -> Body where S.Element == UInt8 {
    var iterator = sequence.makeAsyncIterator()
    let body = self.init(.asyncSequence { allocator -> IOData? in
        var buffer = allocator.buffer(capacity: 1024) // TODO: Magic number
        while buffer.writableBytes > 0, let byte = try await iterator.next() {
            buffer.writeInteger(byte)
        }

        if buffer.readableBytes > 0 {
            return .byteBuffer(buffer)
        }
        return nil
    })
    return body
}

> The way we make it fast in case of AsyncSequence where Element == UInt8 is by having an inlinable transformer from UInt8 to buffer. That is, a wrapper sequence that builds up buffers of, say, 1kB in size (or whatever) before it passes it on. We then wrap that sequence in the body.

Yeah, that's what any stream (IO-related at least) will have to do, and I've mentioned this repeatedly to folks working on IO and streams. The problem with our async story is that it's not clear who must do this chunking, and where, so we may double-chunk unnecessarily. Say, the "async bytes" abstraction https://developer.apple.com/documentation/foundation/filehandle/3766681-bytes does exactly that. So feeding a request from a file by calling .bytes on it and putting that into this request body will work, but we may end up with two aggregating buffers where we wanted one. But I digress: yes, such buffers are exactly what we need at asynchronous boundaries of streams (such as here, where we get passed one).

We don't have a choice, we have to do it. The reason we have to do it is that we have to type-erase the sequence, and the moment we do that the compiler can't see through the loop. This means we can only iterate on the chunks. The better move, I think, would be for .bytes to be augmented by a .chunkedBytes API that returned some appropriate buffer type. This would allow adopters building pipelines to delay chunking until the last moment.

Of course, now we get into an argument about what that buffer type is. 😉
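A rough sketch of what a chunking adapter in the spirit of the suggested `.chunkedBytes` might look like. Everything here is an assumption for illustration: `ChunkedBytes` and `chunked(size:)` are hypothetical names, and `[UInt8]` is used only to sidestep the buffer-type question.

```swift
/// Hypothetical adapter: groups a byte-wise async sequence into chunks of up
/// to `size` bytes. `[UInt8]` stands in for whatever buffer type is chosen.
struct ChunkedBytes<Base: AsyncSequence>: AsyncSequence where Base.Element == UInt8 {
    typealias Element = [UInt8]

    let base: Base
    let size: Int

    struct AsyncIterator: AsyncIteratorProtocol {
        var baseIterator: Base.AsyncIterator
        let size: Int
        var finished = false

        mutating func next() async throws -> [UInt8]? {
            // Never touch the base iterator again once it has reported its end.
            if finished { return nil }
            var chunk: [UInt8] = []
            chunk.reserveCapacity(size)
            while chunk.count < size {
                guard let byte = try await baseIterator.next() else {
                    finished = true
                    break
                }
                chunk.append(byte)
            }
            // A trailing partial chunk is still delivered; an empty one ends the sequence.
            return chunk.isEmpty ? nil : chunk
        }
    }

    func makeAsyncIterator() -> AsyncIterator {
        AsyncIterator(baseIterator: base.makeAsyncIterator(), size: size)
    }
}

extension AsyncSequence where Element == UInt8 {
    /// Delays chunking until the caller asks for it, so it happens only once.
    func chunked(size: Int) -> ChunkedBytes<Self> {
        precondition(size > 0, "chunk size must be positive")
        return ChunkedBytes(base: self, size: size)
    }
}
```

Because the adapter stays generic over `Base`, the per-byte loop can still be specialized by the caller; only the resulting chunk sequence would need to be type-erased.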

Yeah agreed that we have to do it here. Interesting idea with offering a chunkedBytes to allow more control to users there, might be worth a radar :-)

fabianfett · 2021-08-16T12:01:29Z

@adam-fowler

> The only thing I think is missing is AsyncResponse could do with a helper function to collate all the response body buffers into one. I know this is only a couple of lines of code but this would be such a common piece of functionality that I think it is required.

apple/swift-nio#1939


ktoso · 2021-08-16T08:51:18Z

docs/async-await.md

+```
+
+- **Why do we have a deadline in the function signature?** 
+    Task deadlines are not part of the Swift 5.5 release. However we think that they are an important tool to not overload the http client accidentally. For this reason we will not default them.


While we can't commit to making them happen for Swift 6, it's a clear and important thing we want to support in Swift concurrency. So I agree that not doing another "our own" thing until then is the right call here.


ktoso · 2021-08-16T11:36:12Z

docs/async-await.md

+- [refactor to make request non-throwing](https://github.com/swift-server/async-http-client/pull/56)
+- [improve request validation](https://github.com/swift-server/async-http-client/pull/67)
+
+The url validation will become part of the normal request validation that occurs when the request is scheduled on the `HTTPClient`. If the user supplies a request with an invalid url, the http client, will reject the request.


I guess I like to validate earlier rather than later, so personally I would mildly push back against this change...

but since there have been enough issues about it, I guess it's fair to follow up and change... ok then 👍

If you want the URL validation to be part of request creation, you could create a wrapper which initializes a request from a URL object but creates the request using a String. That should give you a reasonable guarantee that HTTPClient will not reject the request due to it being an invalid URL.

It would mean parsing the URL twice, but that’s relatively cheap, and there are potentially ways to avoid it if we get shared Strings in the standard library at some point (they are already part of the ABI). If the shared String is owned by a URL object, the URL parser can assume it is already a valid URL and recover the parsed URL object from the owner without parsing the String again or allocating fresh storage.
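A minimal sketch of the wrapper described above, assuming a request type whose initializer takes a String. `AsyncRequest` below is a stand-in declared only for this example, and `init(validatedURL:)` is a hypothetical name.

```swift
import Foundation

/// Stand-in for the proposed request type, which takes its URL as a String.
struct AsyncRequest {
    var url: String

    init(url: String) {
        self.url = url
    }
}

extension AsyncRequest {
    /// Hypothetical convenience: the caller has already parsed the URL, so
    /// validation failures surface at request creation, not at execution.
    /// The String is parsed a second time when the request is scheduled.
    init(validatedURL: URL) {
        self.init(url: validatedURL.absoluteString)
    }
}
```

A caller would write `URL(string: input).map(AsyncRequest.init(validatedURL:))` and handle the `nil` case up front.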


ktoso · 2021-08-16T12:27:32Z

docs/async-await.md

+  /// the response headers
+  public var headers: HTTPHeaders
+  /// the response payload as an AsyncSequence
+  public var body: Body


It might be worth exploring an Akka HTTP entity-inspired design here before we commit to this API.
This matters especially for the response object because, as we create it, we know whether there will be more bytes or not, so we can immediately construct the right "entity object", e.g. Empty, Strict(), or a stream.

Representing the body as an abstract HTTPEntity which may be "strict" or not then allows us to store the strict bytes directly by reference; we don't have to use the stream at all, other than lazily creating it when someone pulls from it.
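One way such an entity type might be sketched in Swift. This is an illustration under assumptions, not the proposal's design: the `HTTPEntity` cases and the `collect()` method are hypothetical, and `[UInt8]` stands in for `ByteBuffer`.

```swift
/// Hypothetical entity type inspired by the Akka HTTP model described above.
/// `[UInt8]` stands in for NIO's ByteBuffer.
enum HTTPEntity {
    /// The response has no body at all.
    case empty
    /// The whole body was already available when the response was created.
    case strict([UInt8])
    /// More bytes will arrive later and have to be pulled by the consumer.
    case streamed(AsyncThrowingStream<[UInt8], Error>)

    /// Returns the complete body. Only the streamed case has to iterate;
    /// the other two answer immediately without setting up a stream.
    func collect() async throws -> [UInt8] {
        switch self {
        case .empty:
            return []
        case .strict(let bytes):
            return bytes
        case .streamed(let stream):
            var all: [UInt8] = []
            for try await chunk in stream {
                all.append(contentsOf: chunk)
            }
            return all
        }
    }
}
```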

The entire topic of "there's a stream of bytes, but no one subscribed to it" is quite hellish in general.

How does this implementation approach it? If we're doing a proper back-pressured stream, then not reading it is directly connected to not reading from the wire, which may or may not be what the user intended. In Akka we had to invent subscription timeouts to resolve such zombie streams.

In case I'm not clear about the case: response has streamed body; we're a nice streaming API and people expect back-pressure via such stream API. If we never consume this stream, what happens?

(In Akka's case we were very loud that this is user error and one has to do .discardBytes() (but it's annoying))

Note though that I don't yet have good performance intuitions about AsyncSequences... perhaps it is not worth optimizing for knowing that there's a .single(<bytes>) at all since setup and logic costs are comparable between this and an iterator... So mostly sharing as an "are we sure and have we considered alternatives?" note.

> In case I'm not clear about the case: response has streamed body; we're a nice streaming API and people expect back-pressure via such stream API. If we never consume this stream, what happens?
>
> (In Akka's case we were very loud that this is user error and one has to do .discardBytes() (but it's annoying))

I'm afraid we will have to do the same here.

We talked offline a little bit. I think we're in a better place here, because dropping the request can cancel it (and it does, though this was not documented in this document), which leaves us without the nasty "hanging stream" issue 👍

The "must always consume" rule is fine; we'll have to be explicit about it 👍


ktoso · 2021-08-16T12:33:13Z

docs/async-await.md

+  var request = AsyncRequest(url: "https://swift.org/")
+  let response = try await httpClient.execute(request, deadline: .now() + .seconds(3))
+
+  var trailers = try await response.trailers // can not move forward since body must be consumed before.


Probably throw if it is known that body is not consumed yet?

I think we should just crash.

I think I'm on the crash side as well. If the user consumes the trailers before the body, it will never be right. Crash seems to be the right tradeoff.

Yeah, that's true. It's unlikely to ever be correct, and even if it sometimes is, people really must fix it. 👍


fabianfett · 2021-08-16T14:58:48Z

The proposal was updated with new naming. HTTPClient.AsyncRequest has been replaced with HTTPClientRequest and HTTPClient.AsyncResponse has been replaced with HTTPClientResponse. We acknowledge that the new naming is very close to the current naming (just dropping a .). However by dropping the dot we will get types that are findable with autocompletion (just type "HTTP").


ktoso · 2021-08-17T13:03:55Z

docs/async-await.md

@@ -168,6 +168,40 @@ default:
 }
 ```

+Stream upload and download example using the new `FileHandle` api:


"new" where, in http client?

I think what this refers to is the new async APIs on Foundation's FileHandle type.

See: https://developer.apple.com/documentation/foundation/filehandle/3766681-bytes

glbrntt · 2021-08-17T13:23:05Z

docs/async-await.md

+  /// The request's body
+  var body: Body?
+
+  init(url: String) {


Only allowing url to be set in the init might be a little too restrictive. This pattern makes a lot of sense for large configuration objects etc. where adding a new init for each new piece of configuration quickly becomes a burden for maintainers.

I don't know if that's justified here: this is likely to be used much more frequently, so we should make it easy to use even if the maintenance cost is higher. Moreover, I suspect there's much less scope for adding new properties to the request.

On the other hand, the "convenience" APIs might cover this just fine.

glbrntt · 2021-08-18T13:39:21Z

docs/async-await.md

+  var method: HTTPMethod
+
+  /// The request's headers
+  var headers: HTTPHeaders


I was wondering whether HTTPHeaders is the right type here. For http/2 connections we'll need to normalise the header names to be lowercased and repackage them into HPACKHeaders which isn't cheap (in gRPC we used to use the http2-to-1 codecs and deal with http/1 types; we saw a pretty big increase in performance from using http2 directly, most of which came from not converting the headers). It'd be a shame to have to incur that cost here as well.

Keeping HTTPHeaders (over HPACKHeaders or a new AHC-provided headers type) is probably the preferred option at the moment since it's more or less a currency type, but it doesn't feel like an optimal solution.

@ktoso @weissi @PeterAdams-A wdyt?

In gRPC, did you not just move the overhead to HTTP/1? I.e., you made HTTP/2 the default and now have to convert when on HTTP/1?

Yes, exactly. It's a much clearer cut decision for gRPC since http/2 is much more common; so much so that the cost of doing the extra work for http/1 doesn't really matter.

Presumably at the point where this is currently created we don't know which HTTP version we'll be using? Would we need to do some sort of delayed creation? I think we need to provide an easy version to use, even if we also have a fast option.

That's right, we need the headers before we know what version we're using.

One option is AHC providing its own headers type which could wrap either an HTTPHeaders or HPACKHeaders. This has the advantage of being able to change the backing type without breaking API if we decide to change our bias from http/1 to http/2 or vice versa. Of course this potentially makes things worse by adding a third headers type...

Another would be making interop between HTTPHeaders and HPACKHeaders cheaper -- I'm not sure how possible this is though.

We could make the third way a protocol with default implementations for both, such that if we get the right one it's close to free, and the wrong one is just the conversion we'd have to do anyway.
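A sketch of the wrapping-type option discussed above, using an enum-backed struct in place of a protocol. All names are hypothetical, and the plain field arrays stand in for NIO's `HTTPHeaders` and `HPACKHeaders`, which this example does not import.

```swift
/// Hypothetical AHC-owned headers type. The two stored arrays stand in for
/// NIO's HTTPHeaders (HTTP/1) and HPACKHeaders (HTTP/2).
struct ClientHeaders {
    struct Field: Equatable {
        var name: String
        var value: String
    }

    private enum Storage {
        case http1([Field])   // names kept exactly as the user wrote them
        case http2([Field])   // names already lowercased
    }

    private var storage: Storage

    init(http1 fields: [Field]) {
        storage = .http1(fields)
    }

    init(http2 fields: [Field]) {
        storage = .http2(Self.lowercasingNames(of: fields))
    }

    /// Fields for an HTTP/2 connection: free when stored as HTTP/2, otherwise
    /// the lowercasing conversion that would have been needed anyway.
    var http2Fields: [Field] {
        switch storage {
        case .http2(let fields): return fields
        case .http1(let fields): return Self.lowercasingNames(of: fields)
        }
    }

    /// Fields for an HTTP/1 connection; lowercased names are valid as they are.
    var http1Fields: [Field] {
        switch storage {
        case .http1(let fields), .http2(let fields): return fields
        }
    }

    private static func lowercasingNames(of fields: [Field]) -> [Field] {
        fields.map { Field(name: $0.name.lowercased(), value: $0.value) }
    }
}
```

Keeping the storage private is what would let the library change its bias between HTTP/1 and HTTP/2 later without breaking API, at the cost of introducing a third headers type.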

stevapple · 2021-09-23T09:13:19Z

docs/async-await.md

+        handle.write(buffer.readData(buffer.readableBytes)!)
+      case .none:
+        handle.close()
+      }


I suppose we can use for await buffer in response.body { … } here?

In general, yes, because response.body implements AsyncSequence. However, in this specific example we want to support proper backpressure, meaning that we don't want to read more from the body if the write handle is not ready to write more data. We therefore wait to read the next buffer from response.body until writeHandle.writeabilityHandler is called.
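The pull-only-when-writable idea can be sketched without FileHandle. In this illustration `Sink`, `waitUntilWritable()`, and `pump(_:into:)` are hypothetical stand-ins; the real example drives the reads from `writeabilityHandler` instead.

```swift
/// Hypothetical sink that signals when it can accept more data.
actor Sink {
    private(set) var written: [[UInt8]] = []

    /// Stand-in for waiting on a writability callback.
    func waitUntilWritable() async {
        await Task.yield()
    }

    func write(_ chunk: [UInt8]) {
        written.append(chunk)
    }
}

/// Pulls the next chunk only after the sink reports it is writable, so a
/// slow sink directly slows down how fast the body is read.
func pump<Body: AsyncSequence>(_ body: Body, into sink: Sink) async throws
where Body.Element == [UInt8] {
    var iterator = body.makeAsyncIterator()
    while true {
        await sink.waitUntilWritable()
        guard let chunk = try await iterator.next() else { break }
        await sink.write(chunk)
    }
}
```

Driving the iterator by hand makes the ordering explicit: the readiness check comes first, and `next()` is only called afterwards.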

stevapple · 2021-09-23T10:11:14Z

docs/async-await.md

+- **How does cancellation work?** Cancellation works by cancelling the surrounding task:
+
+        ```swift
+        let task = Task {


Could there be a convenience API that makes a request a Task? eg.

let task = httpClient.task(with: request)
// …
let response = await task.value

This is syntactic sugar that allows the HTTP task to be handled by another function. That is, we don't need to await the result in the same context.

Swift already has special syntax to create a child task through async let. What you propose would create a detached Task. I think this would probably make it too easy and users would create detached Tasks just because there is a method for it, even if they would actually want a child Task through async let.
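The difference between the two approaches can be sketched as follows. `execute(_:)` is a hypothetical stand-in for the client call and simply returns a status code after a short sleep.

```swift
/// Hypothetical stand-in for `httpClient.execute(_:)`.
func execute(_ url: String) async throws -> Int {
    try await Task.sleep(nanoseconds: 1_000_000)
    return 200
}

func structured() async throws -> Int {
    // Child task: it is cancelled automatically when `structured()` is
    // cancelled, and it cannot outlive this scope.
    async let status = execute("https://swift.org/")
    // ... other work can run here while the request is in flight ...
    return try await status
}

func unstructured() -> Task<Int, Error> {
    // Not a child task: it outlives this function, and cancelling the caller
    // does not cancel it. The handle has to be cancelled by hand.
    Task { try await execute("https://swift.org/") }
}
```

Both forms let the request run concurrently with other work; only the `async let` form ties its lifetime and cancellation to the surrounding task.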

Add async/await proposal

7ec050b

dnadoba reviewed Aug 16, 2021


fabianfett mentioned this pull request Aug 16, 2021

Add AsyncSequence helpers apple/swift-nio#1939

Draft

ktoso reviewed Aug 16, 2021


fabianfett and others added 3 commits August 16, 2021 14:53

PR review

e2f078e

Update docs/async-await.md

f96567c

Co-authored-by: Konrad `ktoso` Malawski <konrad.malawski@project13.pl>

HTTPClientRequest & HTTPClientResponse

110ab5e

fabianfett requested review from Davidde94, glbrntt, PeterAdams-A and tomerd August 17, 2021 08:42

glbrntt reviewed Aug 17, 2021


Added a full usage example and made clear that request Body is namespaced in HTTPClientRequest

992f006

ktoso reviewed Aug 17, 2021


Add FAQ about convenience APIs

22d8b45

fabianfett force-pushed the ff-async-await-proposal branch from 54ec701 to 22d8b45 Compare August 17, 2021 12:00

Added a FileHandle stream example

9687247

ktoso reviewed Aug 17, 2021


glbrntt reviewed Aug 17, 2021


glbrntt reviewed Aug 18, 2021


stevapple reviewed Sep 23, 2021


dnadoba mentioned this pull request Nov 15, 2021

simple GET fails with HTTPClientError.cancelled #477

Closed

dnadoba mentioned this pull request Nov 25, 2021

[DRAFT] Working in progress async/await prototype #492

Closed

ktoso mentioned this pull request Aug 16, 2021

[SR-15076] AsyncStream.Continuation.YieldResult is hard to use to implement backpressure apple/swift#57402

Open

fabianfett closed this Nov 15, 2022

