Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow HTTP2 encoder to split headers across frames #55322

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

ladeak
Copy link
Contributor

@ladeak ladeak commented Apr 23, 2024

Allow the HTTP2 encoder to split headers across frames

Enable Kestrel's HTTP2 to split large HTTP headers into HEADER and CONTINUATION frames.

Description

Kestrel's HTTP2 implementation limits the max header size to the size of the frame size. If a header's size is larger than the frame size, it throws an exception: throw new HPackEncodingException(SR.net_http_hpack_encode_failure);. RFC 7540 allows the headers to be split into a HEADER frame and CONTINUATION frames.
Before the change Kestrel only used CONTINUATION frames when headers fitted fully within a frame.

This PR changes the above behavior by allowing to split even a single HTTP header into multiple frames. It uses an ArrayPool<byte>, to back the buffer used by the HPack encoder.

When the HPack encoder reports that a single header does not fit the available buffer, the size of the buffer is increased. Note, that the .NET runtime implementation on HttpClient writes all headers to a single buffer before pushing it onto the output, contrary to this implementation that keeps the semantics of Kestrel. It only increases the buffer when a single header fails to be written to the output, otherwise the old behavior is kept. My intention was to keep this behavior so that memory-wise it does not use more memory than the single largest header or the max frame size.
With this PR HPackHeaderWriter uses an enum to tell Http2FrameWriter to increase the buffer or not. When the buffer is too small, its size is doubled.

This behavior is also implemented for trailers. Note that in case of headers, the HEADER frame is never empty because of the response status, while this is not true for trailers. Hence there is a subtle difference when getting the buffer for the initial frame of a header vs. a trailer.

I updated existing tests asserting the previous behavior and added new tests to validate the proposed changes.

Performance

Performance-wise the change is not expected to increase throughput (given it must do more to enable this use-case) but the goal is to have the 'slow-path' is only in the case when a single header is too large. I used the existing Http2FrameWriterBenchmark to compare performance before and after:

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.22631
12th Gen Intel Core i7-1255U, 1 CPU, 12 logical and 10 physical cores
.NET SDK=9.0.100-preview.4.24218.26
  [Host]     : .NET 9.0.0 (9.0.24.21901), X64 RyuJIT
  Job-KLBPPQ : .NET 9.0.0 (9.0.24.21807), X64 RyuJIT

Before changes (updated 2024. 05. 22. main):

Method Mean Error StdDev Op/s Gen 0 Gen 1 Gen 2 Allocated
WriteResponseHeaders 186.6 ns 1.32 ns 1.17 ns 5,357,686.3 0.0002 - - 32 B
Method Mean Error StdDev Op/s Gen 0 Gen 1 Gen 2 Allocated
WriteResponseHeaders 185.9 ns 1.35 ns 1.27 ns 5,377,844.8 0.0002 - - 32 B

After changes using arraypool instead of arraybufferwriter

Method Mean Error StdDev Op/s Gen 0 Gen 1 Gen 2 Allocated
WriteResponseHeaders 182.3 ns 1.01 ns 0.84 ns 5,485,361.4 0.0002 - - 32 B
Method Mean Error StdDev Op/s Gen 0 Gen 1 Gen 2 Allocated
WriteResponseHeaders 186.9 ns 1.38 ns 1.29 ns 5,351,850.0 0.0002 - - 32 B

Fixes #4722

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions label Apr 23, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Apr 23, 2024
@ladeak ladeak marked this pull request as draft April 23, 2024 20:39
@ladeak
Copy link
Contributor Author

ladeak commented Apr 23, 2024

cc. @amcasey who has been in the discussions of the corresponding issue.

@amcasey
Copy link
Member

amcasey commented Apr 24, 2024

Thanks for this! Thorough, as always.

Some notes:

  1. The CI failure is unrelated. No point in re-running, since we're expecting revisions.
  2. It may be a little while before I have time to review this in detail.
  3. The description mentions doubling the size of a buffer, as needed - I assume there's a cap on the size of that buffer so it can't grow indefinitely and DoS the server?
  4. The description (and tests) only mention spreading a header into a single CONTINUATION. I'm assuming this will work if the header is, e.g. 33 KB?
  5. Is there a place we could use a simple flag to disable the new behavior and revert to the old behavior? e.g. if I wanted a NeverSplitHeaders appcontext switch, would that be straightforward to add or would be need to duplicate a bunch of code.

Some test questions (you may already have code/coverage for this - I haven't checked):

  1. What happens if the header name is well-known and gets compressed? Is the 16KB limit based on the compressed or uncompressed size?
  2. What happens if the header name itself is more than 16KB? (Throwing might be appropriate, but we should at least know.)
  3. What happens if the header is larger than the maximum total header size? (I'm pretty sure there's an Http2ServerLimit for this.)
  4. I noticed that HPACK had some handling for never-compressed literals. Does this work for those?
  5. If there's a very short header before the very long header (e.g. one that compresses to one or two bytes), is that first HEADERS frame tiny?

@ladeak
Copy link
Contributor Author

ladeak commented Apr 24, 2024

@amcasey let me reply inline:

  1. The description mentions doubling the size of a buffer, as needed - I assume there's a cap on the size of that buffer so it can't grow indefinitely and DoS the server?

You are wondering about headers that are roundtripped to the client with large size (previously would fail, now it could DDoS)? I need to check Kestrel's H2 header size limits (you also mention), but there is nothing in the Http2FrameWriter in this regard.

  1. The description (and tests) only mention spreading a header into a single CONTINUATION. I'm assuming this will work if the header is, e.g. 33 KB?

It can span into zero or more CONTINUATION frames.

  1. Is there a place we could use a simple flag to disable the new behavior and revert to the old behavior? e.g. if I wanted a NeverSplitHeaders appcontext switch, would that be straightforward to add or would be need to duplicate a bunch of code.

There is no such place, but it could be very well built along the Kestrel's limit or an AppContext switch. Please let me know if building such a would be preference. But note, that previously "possible" use-cases still work the same as before, so the switch would only control if large headers are allowed or not -> hence a limit might be suitable option.

  1. What happens if the header name is well-known and gets compressed? Is the 16KB limit based on the compressed or uncompressed size?

I did not come across compression/no-compression on this path. HPack encodes the header values into this buffer.

  1. What happens if the header name itself is more than 16KB? (Throwing might be appropriate, but we should at least know.)

The header is written to a buffer, which is split into CONTINUATION frames, so it does not matter if the name or the value is being oversized.

  1. What happens if the header is larger than the maximum total header size? (I'm pretty sure there's an Http2ServerLimit for this.)

MaxRequestHeaderFieldSize ? -> I need to test the behavior.

  1. I noticed that HPACK had some handling for never-compressed literals. Does this work for those?

It works on anything that HPack writes to my understanding. I will double confirm.

  1. If there's a very short header before the very long header (e.g. one that compresses to one or two bytes), is that first HEADERS frame tiny?

-> If the long one does not fit in the same frame, yes, the initial header will be sent in a tiny frame. This is even true for the "current" behavior.

@JamesNK
Copy link
Member

JamesNK commented Apr 24, 2024

Background: I've looked at HTTP2 headers a lot. I wrote some of the dynamic support and rewrote the writer and parser at one point.

I haven't looked through the code in detail yet. Some initial thoughts:

  • Well done for jumping into HTTP2. It's complex. And Kestrel's implementation is complex.
  • I like the feature. Kestrel's header limit on frame size has always felt arbitrary. We did it because it was easy and fast, not because it was the best thing to do. I recently wrote a feature that put a lot of content into a header and was mindful of this limitation.
  • I know Kestrel has various limits for request headers. I don't believe there are limits on response headers (other than the max frame size). The idea is an app is responsible for sending the response headers so no need to limit it. However, we should look to see what HTTP/1.1 does in this area. If it doesn't limit what a response can do with headers, then neither should HTTP/2.
  • But we should stress this and put some kind of upper limit. What happens if someone wants a 20-megabyte response header? Or a 2-gigabyte response header? Something should blow up before the server tries to double a gigabyte buffer and blows up from an excessive amount of data.
  • Reading and writing response headers is extremely performance critical. Because it's stateful (the dynamic table is on the connection) it basically locks everything. Must reduce any performance overhead.

@amcasey
Copy link
Member

amcasey commented Apr 24, 2024

@ladeak Thanks! Your responses make sense.

Regarding the appcontext switch, I regard this as a fairly risky change because it's touching such critical code (not because of anything you've done - just the circumstances). It would be good to at least think about how to make it possible to revert (as exactly as possible) to the old behavior. It may turn out to require too much duplicate code or API changes or something that makes it unacceptable, but I think we need to at least know what it would take. Open to differing opinions from @mgravell or @JamesNK.

@JamesNK
Copy link
Member

JamesNK commented Apr 25, 2024

I don't think we need a switch. If this lands in a mid .NET 9 preview, then it should go through a lot of use and testing before GA.

For example, the work David did to rewrite response writing in .NET 7(?) was much more complex and we didn't have a fallback.

@amcasey
Copy link
Member

amcasey commented Apr 25, 2024

I don't think we need a switch. If this lands in a mid .NET 9 preview, then it should go through a lot of use and testing before GA.

For example, the work David did to rewrite response writing in .NET 7(?) was much more complex and we didn't have a fallback.

Good enough for me. I hadn't considered how well exercised this code is by TechEmpower, etc.

@amcasey
Copy link
Member

amcasey commented Apr 26, 2024

@ladeak Did you receive the initial feedback you needed? Is this ready for a full review or are you still working on it. There's no rush - I just wondered whether the next steps were our responsibility. Thanks!

@ladeak
Copy link
Contributor Author

ladeak commented Apr 26, 2024

@amcasey Going to come back tomorrow with some findings.

@ladeak
Copy link
Contributor Author

ladeak commented Apr 27, 2024

Discussion about the header size limits: as I understood there is a general desire to have a limit. However, response headers mostly depend on the app, and the way app handles headers. I have not found limits for HTTP/1.1. An empty/default Kestrel ASP.NET Core webapp also allows as large headers as desired with h1. On the consumer-side I ran into limits though (H1):

.NET HttpClient has MaxResponseHeadersLength with the default of 65536 bytes.
curl on Linux - curl: (27) Rejected 140725 bytes header (max is 102400)!
curl on Windows same as on Linux - curl: (27) Out of memory
Chromium - Edge: returns ERR_RESPONSE_HEADERS_TOO_BIG at ~262000 bytes.

When I run the app with IISExpress, it seems to only returns the remainder of 64k. (header mod 65536).

@amcasey , the following questions would need further clarification:

  • hardcode a "big enough" limit or to expose this on Kestrel options under the Http2 limits?
  • what default value should this limit have?

juliobacoli

This comment was marked as spam.

@amcasey
Copy link
Member

amcasey commented Apr 29, 2024

hardcode a "big enough" limit or to expose this on Kestrel options under the Http2 limits?
what default value should this limit have?

Since the spec doesn't give a limit, I think we need to give users a way to override whatever we decide. I'm not sure why it would be specific to http/2 though - presumably the same concern applies to http/1.1 or http/3.

My first thought for a default would be double MaxRequestHeadersTotalSize (i.e. 64 BK), but @JamesNK makes the sensible point that it's really at the server's discretion and we really just need a cap that prevents things from getting out of hand - maybe 1 MB?

If we were to decide this shouldn't get a public API, I'd want to go even higher - maybe 10 MB.

Thoughts, @JamesNK @mgravell?

@ladeak
Copy link
Contributor Author

ladeak commented Apr 29, 2024

@amcasey I think I have not thought about http/1.1 about a limit, given it does not have currently, and setting 64KB would be breaking, wouldn't it? (I am not sure how difficult it could be to implement this for http/1.1, http/3 looks similar to h2 in code structure. But it makes sense from the point of view you describe that it could be a setting that applies to all versions of http.

A question if it is public: should it apply to a single header or to the total headers. Consumers HttpClient and Edge had a total while curl per header limit.
If it is total headers, does it include trailers?

@amcasey
Copy link
Member

amcasey commented Apr 29, 2024

A question if it is public: should it apply to a single header or to the total headers. Consumers HttpClient and Edge had a total while curl per header limit.
If it is total headers, does it include trailers?

Because we're guarding against resource utilization rather than malformed responses, I think the limit should apply to the total size of all headers, rather than to the size of any individual header. Similarly, if we're reducing the whole detection mechanism to a single number, I would expect trailers to be included. I'm open to feedback on both.

I think I have not thought about http/1.1 about a limit, given it does not have currently, and setting 64KB would be breaking, wouldn't it?

Yes, it would. I think we generally accept breaks in service of DoS prevention, but I agree that this is a strong argument for choosing a default that is larger than we expect anyone to use in practice.

If we felt really strongly about this, I could live with adding limits to both http/2 and http/3 and not to KestrelServerLimits directly. I just don't want to end up in a state where we have three identical properties (or more, when http/4 comes out).

@ladeak
Copy link
Contributor Author

ladeak commented May 1, 2024

@amcasey , I added a commit that has a new limit on KestrelServerLimits, and this is also respected by Http2FrameWriter.
I think the code ended up quite a bit complicated (enforcing the limit for written headers + headers not yet written but requiring a larger buffer, etc.).

One thing I found on the way: SETTINGS_MAX_HEADER_LIST_SIZE is an advisory setting, now with the limit might worth also respecting it. Not all clients send it (for example, HttpClient does not) - unfortunately it is not sufficient against DoS.

I would expect similar implementation would be needed on H/1.1 and H/3 (because the public setting is on Kestrel level), hence not sure if a public limit is worth all the complexity to be added.

Maybe a setting called MaxHeaderBufferSize would simplify things a lot, but I could also see why you would not want to expose such an implementation detail related setting on a public API. So that leads my thought process back to your initial suggestion: an appcontext switch for MaxHeaderBufferSize - although not sure if numbers are supported there.

@amcasey
Copy link
Member

amcasey commented May 16, 2024

But it broke HttpClient , so the client side:

I'm okay with breaking HttpClient - a server doing this for real could be working with a custom client that accepts giant headers. Thanks for confirming!

@ladeak ladeak requested a review from amcasey May 16, 2024 20:08
@amcasey
Copy link
Member

amcasey commented May 16, 2024

@ladeak I saw your update, but I don't think I'm going to have a chance to look at it today.

Copy link
Member

@amcasey amcasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not feeling all that well (and might be out at the beginning of next week), so I might not be thinking very clearly, but I think I need an overview of the relationships among WriteResponseHeadersUnsynchronized, WriteDataAndTrailersAsync, SplitHeaderFramesToOutput, and FinishWritingHeadersUnsynchronized.

It looks like all calls to FinishWritingHeadersUnsynchronized are immediately preceded by calls to SplitHeaderFramesToOutput, but it also calls SplitHeaderFramesToOutput? Maybe that call should be moved into FinishWritingHeadersUnsynchronized?

Why do both WriteDataAndTrailersAsync and FinishWritingHeadersUnsynchronized have loops that grow the buffer even though one calls the other? Naively, it feels like exactly one method should be responsible for looping and others should delegate to it as necessary. Maybe I'm missing some functional requirement or perf benefit.

This change feels close - looking forward to catching up next week.

{
// More headers sent in CONTINUATION frames.
_headerEncodingBuffer.Advance(payloadLength);
SplitHeaderFramesToOutput(streamId, endOfHeaders: false, isFramePrepared: true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The frame that has been prepared is the initial HEADERS and not the first CONTINUATION, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in this branch BeginEncodeHeaders returns MoreHeaders, so it means we can write the current buffer as is to the output and continue writing the rest of the headers with FinishWritingHeadersUnsynchronized.

This branch could also write the frame directly as I don't think in this case we ever get a buffer larger to a single frame (so we could drop SplitHeaderFramesToOutput and write the frame directly as in the Done case but without the END_HEADERS flag). Note that this is not true when writing the TRAILERS, because that case might return BufferTooSmall already for the initial write to the buffer frame. - I am happy to change this branch as I propose here, if you think it worth it.

In this case the frame is prepared as a HEADERS frame in line 507, it is sent and further CONTINUATION frames will be used by FinishWritingHeadersUnsynchronized. I think the comment here might be misleading, I will clarify that.

@ladeak
Copy link
Contributor Author

ladeak commented May 18, 2024

@amcasey I wish you a smooth recovery.

I'm not feeling all that well (and might be out at the beginning of next week), so I might not be thinking very clearly, but I think I need an overview of the relationships among WriteResponseHeadersUnsynchronized, WriteDataAndTrailersAsync, SplitHeaderFramesToOutput, and FinishWritingHeadersUnsynchronized.

It looks like all calls to FinishWritingHeadersUnsynchronized are immediately preceded by calls to SplitHeaderFramesToOutput, but it also calls SplitHeaderFramesToOutput? Maybe that call should be moved into FinishWritingHeadersUnsynchronized?

Yes, I actually used to have an additional callsite for SplitHeaderFramesToOutput in FinishWritingHeadersUnsynchronized. The initial implementation called an initial split before entering the loop.

I removed this initial call to SplitHeaderFramesToOutput so FinishWritingHeadersUnsynchronized had less branching and parameters needed. When writing the TRAILERS the SplitHeaderFramesToOutput is not preceeded by FinishWritingHeadersUnsynchronized, so FinishWritingHeadersUnsynchronized method was requiring the branching operation on endOfHeaders.
I also felt it is a bit cleaner to pass the isFramePrepared explicitly from the same method where the frame is actually prepared rather than implicitly figuring it out in FinishWritingHeadersUnsynchronized.

But of course, I am happy to refactor 'back' as suggested.

Why do both WriteDataAndTrailersAsync and FinishWritingHeadersUnsynchronized have loops that grow the buffer even though one calls the other? Naively, it feels like exactly one method should be responsible for looping and others should delegate to it as necessary. Maybe I'm missing some functional requirement or perf benefit.

WriteDataAndTrailersAsync calls BeginEncodeHeaders and loops only for the first buffer, and FinishWritingHeadersUnsynchronized write CONTINUATIONS only.
The WriteDataAndTrailersAsync requires the loop, so if we have a single large trailer, BeginEncodeHeaders will return BufferTooSmall and calling FinishWritingHeadersUnsynchronized could result an empty HEADERS, CONTINUATION, CONTINUATION, etc. frames, because FinishWritingHeadersUnsynchronized does not have this notion of different frame types.

Note, when writing the response headers the loop could be dropped because there is a status code, so we never end up with empty HEADERS frame.

I am looking at the difference between HPackHeaderWriter.BeginEncodeHeaders (the overload without the status code) and HPackHeaderWriter.ContinueEncodeHeaders, and I wonder if it can be refactored in a way to accommodate the above question - not sure if it is a good path to go down though..

@amcasey
Copy link
Member

amcasey commented May 20, 2024

@ladeak I sketched out the code organization I had in mind. It's probably wrong - I didn't even try to compile it - but I think it illustrates my two goals: making headers and trailers look similar and only manipulating the buffer size in a single method. It also restores the fast path in the trailers case - I think that may have been lost.

What do you think? I wrote it as a way to clarify my questions and not as a proposal.

Same caveat as before: I'm not feeling very well, so it may not make sense at all. 😆

@ladeak
Copy link
Contributor Author

ladeak commented May 21, 2024

@amcasey thank you, I see your idea now. With an initial read I see two edge cases:

  • the 'Done' case when writing the HEADER frame might not enjoy the 'fast path', because the buffer could be already set to a larger capacity than the frame size when writing the response header. Seems a simple thing to add. (I think the same is true for the 'more headers'. I would avoid allocating new buffers because the original Http2FrameWriter also only had a single allocation.
  • there was a reason I had to re-initialize the '_headersEnumerator' for the TRAILERS. After checking the code I think because BeginEncodeHeaders calls MoveNext. I pull the sketch later today and try to take it further.

I wish you getting better. Please note, that I also have an activity for the next 5 weeks 3 days a week that slightly reduces my normal capacity (it started 2 weeks ago).

@amcasey
Copy link
Member

amcasey commented May 21, 2024

the 'Done' case when writing the HEADER frame might not enjoy the 'fast path', because the buffer could be already set to a larger capacity than the frame size when writing the response header

Probably not the only oversight. 😆

I would avoid allocating new buffers because the original Http2FrameWriter also only had a single allocation

Fair point, but I'm a little concerned about the buffer never shrinking (AFAICT).

there was a reason I had to re-initialize the '_headersEnumerator' for the TRAILERS

I'm interested to know what you find.

I wish you getting better

Thanks!

reduces my normal capacity

No worries. We're very grateful for the work you've already put in - do what works for you.

@ladeak
Copy link
Contributor Author

ladeak commented May 21, 2024

@amcasey I made the 'Sketch' working and passing tests on my machine. I push changes with the latest commit.
I could 'only' make it work by adding a new HPackHeaderWriter.RetryBeginEncodeHeaders method. The other thing I notice, is that the 'Fast' path is very slightly slower, as it needs to call Advance method and read the WrittenSpan property.

I would add maybe 1-2 additional tests to be more comfortable with this implementation though. (ie. for the new RetryBeginEncodeHeaders).

I have not yet handled the fact that _headerEncodingBuffer grows within a connection. My initial idea was to address it by recreating a new one after the trailers are written and if the size is larger to the _maxFrameSize.
Then it might work only for certain headers. ie. I have a large Cookie (when I have trouble with large headers, it is always a cookie), then that cookie would be large on all streams of the connection.
I would like to avoid allocating unnecessary though. I will come back to this tomorrow and continue.

EDIT: I could not use header's Reset because it does not reset the unKnownEnumerator's position.

EDIT2: I realize that ArrayBufferWriter is anyway a less optimal choice, because it does a copy upon growing the underlying buffer. Instead, I will restore the original array as before and use an ArrayBufferPool for headers that need a large buffer.

@ladeak
Copy link
Contributor Author

ladeak commented May 22, 2024

I have updated the Benchmarks, and I noticed a significant degradation on the main branch. My measurements from earlier this month vs now:

Before changes (updated 2024. 05. 04. main branch):

Method Mean Error StdDev Op/s Gen 0 Gen 1 Gen 2 Allocated
WriteResponseHeaders 79.64 ns 0.581 ns 0.515 ns 12,556,569.9 0.0002 - - 32 B
Method Mean Error StdDev Op/s Gen 0 Gen 1 Gen 2 Allocated
WriteResponseHeaders 77.43 ns 0.355 ns 0.315 ns 12,914,602.3 0.0002 - - 32 B

Before changes (updated 2024. 05. 22. main branch):

Method Mean Error StdDev Op/s Gen 0 Gen 1 Gen 2 Allocated
WriteResponseHeaders 186.6 ns 1.32 ns 1.17 ns 5,357,686.3 0.0002 - - 32 B
Method Mean Error StdDev Op/s Gen 0 Gen 1 Gen 2 Allocated
WriteResponseHeaders 185.9 ns 1.35 ns 1.27 ns 5,377,844.8 0.0002 - - 32 B

@amcasey
Copy link
Member

amcasey commented May 23, 2024

I have updated the Benchmarks, and I noticed a significant degradation on the main branch. My measurements from earlier this month vs now:

Nicer code definitely doesn't justify that kind of perf hit, but I'd be interested to know what's causing it. Maybe the slow path just needs to be in a separate helper method to make the JIT happy?

@amcasey
Copy link
Member

amcasey commented May 24, 2024

I could 'only' make it work by adding a new HPackHeaderWriter.RetryBeginEncodeHeaders method

I'm not sure I understand what's going wrong here. It sounds like maybe WriteDataAndTrailersAsync needs to reset or re-initialize _headersEnumerator if it sees BufferTooSmall?

EDIT: I could not use header's Reset because it does not reset the unKnownEnumerator's position.

I think this is saying that resetting the enumerator would have been an alternative to introducing RetryBeginEncodeHeaders, but it doesn't work with the unknown-enumerator test hook? Can we either make it work or just accept that that implementation throws? (Or maybe I've misunderstood entirely.)

EDIT2: I realize that ArrayBufferWriter is anyway a less optimal choice, because it does a copy upon growing

I like the pool approach. It seems like it also prevents the frame writer from holding onto a huge buffer for the lifetime of the connection (though I agree that that it's likely to want a big buffer for the next request).

Your changes make sense to me but I didn't read line-by-line because it seems like you're still investigating things (esp perf?). Let me know if there's something specific you want me to look at. Thanks again!

@ladeak
Copy link
Contributor Author

ladeak commented May 24, 2024

For trailer, even the first write might end up with BufferToSmall when there is only a single large trailer value.
This means that when we repeatedly use BeginEncodeHeaders. But BeginEncodeHeaders moves the header enumerator's position, so that needs to be somehow reset. There are three ways I could think of resetting the enumerator to the beginning (before the first item):

  • Reinitialize the enumerator with the headers collection. This works, but needs access to the headers. Passing the trailers collection to FinishWritingHeadersUnsynchronized for this seemed wrong.
  • Using _headersEnumerator.Reset, but it only resets the 'known' headers not the unknown collection. Is this bug? Not sure but applies to trailers and headers both.
  • Or adding a RetryBeginEncodeHeaders which does the same as BeginEncodeHeaders but without moving the enumerator - I chose this option.

I am currently not investigating why the main branch shows a performance degradation, I was hoping it is already known - if not, I wanted to bring some light on it. There was only a single change in the file, but that change does not seem to be related.

}
if (writeResult == HPackHeaderWriter.HeaderWriteResult.Done)
{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One consideration could be to only return largeHeaderBuffer to the pool in case the method returns here.
That way it could be continued to be used by the code below (L640 onwards), but I found the current implementation easier to follow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions community-contribution Indicates that the PR has been added by a community member pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow the encoder to split headers across frames
7 participants