-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow HTTP2 encoder to split headers across frames #55322
base: main
Are you sure you want to change the base?
Conversation
cc. @amcasey who has been in the discussions of the corresponding issue. |
Thanks for this! Thorough, as always. Some notes:
Some test questions (you may already have code/coverage for this - I haven't checked):
|
@amcasey let me reply inline:
You are wondering about headers that are roundtripped to the client with large size (previously would fail, now it could DDoS)? I need to check Kestrel's H2 header size limits (you also mention), but there is nothing in the Http2FrameWriter in this regard.
It can span into zero or more CONTINUATION frames.
There is no such place, but it could be very well built along the Kestrel's limit or an AppContext switch. Please let me know if building such a would be preference. But note, that previously "possible" use-cases still work the same as before, so the switch would only control if large headers are allowed or not -> hence a limit might be suitable option.
I did not come across compression/no-compression on this path. HPack encodes the header values into this buffer.
The header is written to a buffer, which is split into CONTINUATION frames, so it does not matter if the name or the value is being oversized.
MaxRequestHeaderFieldSize ? -> I need to test the behavior.
It works on anything that HPack writes to my understanding. I will double confirm.
-> If the long one does not fit in the same frame, yes, the initial header will be sent in a tiny frame. This is even true for the "current" behavior. |
src/Servers/Kestrel/Core/src/Internal/Http2/Http2FrameWriter.cs
Outdated
Show resolved
Hide resolved
src/Servers/Kestrel/Core/src/Internal/Http2/Http2FrameWriter.cs
Outdated
Show resolved
Hide resolved
src/Servers/Kestrel/Core/src/Internal/Http2/Http2FrameWriter.cs
Outdated
Show resolved
Hide resolved
Background: I've looked at HTTP2 headers a lot. I wrote some of the dynamic support and rewrote the writer and parser at one point. I haven't looked through the code in detail yet. Some initial thoughts:
|
@ladeak Thanks! Your responses make sense.
|
I don't think we need a switch. If this lands in a mid .NET 9 preview, then it should go through a lot of use and testing before GA. For example, the work David did to rewrite response writing in .NET 7(?) was much more complex and we didn't have a fallback. |
Good enough for me. I hadn't considered how well exercised this code is by TechEmpower, etc. |
@ladeak Did you receive the initial feedback you needed? Is this ready for a full review or are you still working on it. There's no rush - I just wondered whether the next steps were our responsibility. Thanks! |
@amcasey Going to come back tomorrow with some findings. |
Discussion about the header size limits: as I understood there is a general desire to have a limit. However, response headers mostly depend on the app, and the way app handles headers. I have not found limits for HTTP/1.1. An empty/default Kestrel ASP.NET Core webapp also allows as large headers as desired with h1. On the consumer-side I ran into limits though (H1): .NET When I run the app with IISExpress, it seems to only returns the remainder of 64k. (header mod 65536). @amcasey , the following questions would need further clarification:
|
Since the spec doesn't give a limit, I think we need to give users a way to override whatever we decide. I'm not sure why it would be specific to http/2 though - presumably the same concern applies to http/1.1 or http/3. My first thought for a default would be double If we were to decide this shouldn't get a public API, I'd want to go even higher - maybe 10 MB. |
@amcasey I think I have not thought about http/1.1 about a limit, given it does not have currently, and setting 64KB would be breaking, wouldn't it? (I am not sure how difficult it could be to implement this for http/1.1, http/3 looks similar to h2 in code structure. But it makes sense from the point of view you describe that it could be a setting that applies to all versions of http. A question if it is public: should it apply to a single header or to the total headers. Consumers HttpClient and Edge had a total while curl per header limit. |
Because we're guarding against resource utilization rather than malformed responses, I think the limit should apply to the total size of all headers, rather than to the size of any individual header. Similarly, if we're reducing the whole detection mechanism to a single number, I would expect trailers to be included. I'm open to feedback on both.
Yes, it would. I think we generally accept breaks in service of DoS prevention, but I agree that this is a strong argument for choosing a default that is larger than we expect anyone to use in practice. If we felt really strongly about this, I could live with adding limits to both http/2 and http/3 and not to |
@amcasey , I added a commit that has a new limit on One thing I found on the way: I would expect similar implementation would be needed on H/1.1 and H/3 (because the public setting is on Kestrel level), hence not sure if a public limit is worth all the complexity to be added. Maybe a setting called |
I'm okay with breaking |
@ladeak I saw your update, but I don't think I'm going to have a chance to look at it today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not feeling all that well (and might be out at the beginning of next week), so I might not be thinking very clearly, but I think I need an overview of the relationships among WriteResponseHeadersUnsynchronized
, WriteDataAndTrailersAsync
, SplitHeaderFramesToOutput
, and FinishWritingHeadersUnsynchronized
.
It looks like all calls to FinishWritingHeadersUnsynchronized
are immediately preceded by calls to SplitHeaderFramesToOutput
, but it also calls SplitHeaderFramesToOutput
? Maybe that call should be moved into FinishWritingHeadersUnsynchronized
?
Why do both WriteDataAndTrailersAsync
and FinishWritingHeadersUnsynchronized
have loops that grow the buffer even though one calls the other? Naively, it feels like exactly one method should be responsible for looping and others should delegate to it as necessary. Maybe I'm missing some functional requirement or perf benefit.
This change feels close - looking forward to catching up next week.
src/Servers/Kestrel/Core/src/Internal/Http2/Http2FrameWriter.cs
Outdated
Show resolved
Hide resolved
src/Servers/Kestrel/Core/src/Internal/Http2/Http2FrameWriter.cs
Outdated
Show resolved
Hide resolved
{ | ||
// More headers sent in CONTINUATION frames. | ||
_headerEncodingBuffer.Advance(payloadLength); | ||
SplitHeaderFramesToOutput(streamId, endOfHeaders: false, isFramePrepared: true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The frame that has been prepared is the initial HEADERS
and not the first CONTINUATION
, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in this branch BeginEncodeHeaders
returns MoreHeaders
, so it means we can write the current buffer
as is to the output and continue writing the rest of the headers with FinishWritingHeadersUnsynchronized
.
This branch could also write the frame directly as I don't think in this case we ever get a buffer
larger to a single frame (so we could drop SplitHeaderFramesToOutput
and write the frame directly as in the Done
case but without the END_HEADERS
flag). Note that this is not true when writing the TRAILERS, because that case might return BufferTooSmall
already for the initial write to the buffer frame. - I am happy to change this branch as I propose here, if you think it worth it.
In this case the frame is prepared as a HEADERS
frame in line 507, it is sent and further CONTINUATION
frames will be used by FinishWritingHeadersUnsynchronized
. I think the comment here might be misleading, I will clarify that.
@amcasey I wish you a smooth recovery.
Yes, I actually used to have an additional callsite for I removed this initial call to But of course, I am happy to refactor 'back' as suggested.
Note, when writing the response headers the loop could be dropped because there is a status code, so we never end up with empty HEADERS frame. I am looking at the difference between |
@ladeak I sketched out the code organization I had in mind. It's probably wrong - I didn't even try to compile it - but I think it illustrates my two goals: making headers and trailers look similar and only manipulating the buffer size in a single method. It also restores the fast path in the trailers case - I think that may have been lost. What do you think? I wrote it as a way to clarify my questions and not as a proposal. Same caveat as before: I'm not feeling very well, so it may not make sense at all. 😆 |
@amcasey thank you, I see your idea now. With an initial read I see two edge cases:
I wish you getting better. Please note, that I also have an activity for the next 5 weeks 3 days a week that slightly reduces my normal capacity (it started 2 weeks ago). |
Probably not the only oversight. 😆
Fair point, but I'm a little concerned about the buffer never shrinking (AFAICT).
I'm interested to know what you find.
Thanks!
No worries. We're very grateful for the work you've already put in - do what works for you. |
@amcasey I made the 'Sketch' working and passing tests on my machine. I push changes with the latest commit. I would add maybe 1-2 additional tests to be more comfortable with this implementation though. (ie. for the new I have not yet handled the fact that EDIT: I could not use header's Reset because it does not reset the unKnownEnumerator's position. EDIT2: I realize that |
…wner to not to ddos itself
I have updated the Benchmarks, and I noticed a significant degradation on the main branch. My measurements from earlier this month vs now: Before changes (updated 2024. 05. 04. main branch):
Before changes (updated 2024. 05. 22. main branch):
|
Nicer code definitely doesn't justify that kind of perf hit, but I'd be interested to know what's causing it. Maybe the slow path just needs to be in a separate helper method to make the JIT happy? |
I'm not sure I understand what's going wrong here. It sounds like maybe
I think this is saying that resetting the enumerator would have been an alternative to introducing
I like the pool approach. It seems like it also prevents the frame writer from holding onto a huge buffer for the lifetime of the connection (though I agree that that it's likely to want a big buffer for the next request). Your changes make sense to me but I didn't read line-by-line because it seems like you're still investigating things (esp perf?). Let me know if there's something specific you want me to look at. Thanks again! |
For trailer, even the first write might end up with
I am currently not investigating why the |
} | ||
if (writeResult == HPackHeaderWriter.HeaderWriteResult.Done) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One consideration could be to only return largeHeaderBuffer
to the pool in case the method returns here.
That way it could be continued to be used by the code below (L640 onwards), but I found the current implementation easier to follow.
Allow the HTTP2 encoder to split headers across frames
Enable Kestrel's HTTP2 to split large HTTP headers into HEADER and CONTINUATION frames.
Description
Kestrel's HTTP2 implementation limits the max header size to the size of the frame size. If a header's size is larger than the frame size, it throws an exception:
throw new HPackEncodingException(SR.net_http_hpack_encode_failure);
. RFC 7540 allows the headers to be split into a HEADER frame and CONTINUATION frames.Before the change Kestrel only used CONTINUATION frames when headers fitted fully within a frame.
This PR changes the above behavior by allowing to split even a single HTTP header into multiple frames. It uses an
ArrayPool<byte>
, to back the buffer used by the HPack encoder.When the HPack encoder reports that a single header does not fit the available buffer, the size of the buffer is increased. Note, that the .NET runtime implementation on HttpClient writes all headers to a single buffer before pushing it onto the output, contrary to this implementation that keeps the semantics of Kestrel. It only increases the buffer when a single header fails to be written to the output, otherwise the old behavior is kept. My intention was to keep this behavior so that memory-wise it does not use more memory than the single largest header or the max frame size.
With this PR
HPackHeaderWriter
uses an enum to tellHttp2FrameWriter
to increase the buffer or not. When the buffer is too small, its size is doubled.This behavior is also implemented for trailers. Note that in case of headers, the HEADER frame is never empty because of the response status, while this is not true for trailers. Hence there is a subtle difference when getting the buffer for the initial frame of a header vs. a trailer.
I updated existing tests asserting the previous behavior and added new tests to validate the proposed changes.
Performance
Performance-wise the change is not expected to increase throughput (given it must do more to enable this use-case) but the goal is to have the 'slow-path' is only in the case when a single header is too large. I used the existing
Http2FrameWriterBenchmark
to compare performance before and after:Before changes (updated 2024. 05. 22. main):
After changes using arraypool instead of arraybufferwriter
Fixes #4722