HTTP/1 balsa parser prematurely purges parsed headers resulting in status monitor failure #34096

Closed
RenjieTang opened this issue May 11, 2024 · 6 comments · Fixed by #34140

Comments

@RenjieTang
Contributor

During UpstreamHttp11ConnectSocket::isValidConnectResponse(), a framer is used under the hood to parse the header. Later its statusCode() is used to determine validity.

However, when the response header contains content-length: 0, BalsaParser::MessageDone() will be invoked which purges the parsed header. Later when statusCode() is called, since the header is empty, the default 0 status code is returned and the response is mistaken as invalid.

This bug is sneaky because when content-length isn't present, the framer waits to read more data until the connection closes, and thus MessageDone() is never called.

I think the correct fix should be to not reset the header when the framer is cleared, because the header is owned by the BalsaParser.
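To make the failure mode concrete, here is a minimal sketch of the behavior described above. It is a toy model, not Envoy code: `ToyHeaders`, `ToyParser`, `parseResponseHeaders()`, and the "status must be 200" check are all hypothetical stand-ins, and only the purge-on-message-done behavior is taken from the description.

```cpp
#include <cassert>

// Toy stand-in for the parsed header state; NOT the real BalsaHeaders.
struct ToyHeaders {
  int status_code = 0; // 0 doubles as "nothing parsed", as described above
  void clear() { status_code = 0; }
};

// Toy stand-in for the parser; NOT the real BalsaParser.
class ToyParser {
public:
  // Pretends a full response header block was parsed.
  // content_length_zero models a "content-length: 0" header.
  void parseResponseHeaders(int status, bool content_length_zero) {
    headers_.status_code = status;
    if (content_length_zero) {
      // No body is expected, so the message is complete right away.
      messageDone();
    }
    // Otherwise the parser keeps waiting for body bytes until the
    // connection closes, and messageDone() is never reached.
  }

  int statusCode() const { return headers_.status_code; }

private:
  void messageDone() { headers_.clear(); } // purges the parsed headers
  ToyHeaders headers_;
};

// Hypothetical validity check (assumption): accept only a 200 response.
bool isValidConnectResponse(const ToyParser& parser) {
  return parser.statusCode() == 200;
}

// Walks through both cases from the description.
void demoPrematurePurge() {
  ToyParser with_length;
  with_length.parseResponseHeaders(200, /*content_length_zero=*/true);
  assert(!isValidConnectResponse(with_length)); // status was already purged

  ToyParser without_length;
  without_length.parseResponseHeaders(200, /*content_length_zero=*/false);
  assert(isValidConnectResponse(without_length)); // headers still present
}
```

In this model the same 200 response is rejected only when the message completes before the status code is read.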

@RenjieTang RenjieTang added bug triage Issue requires triage labels May 11, 2024
@RyanTheOptimist
Contributor

@bencebeky

@alyssawilk
Contributor

alyssawilk commented May 13, 2024

Also, looks to me like onStatus is returning the reason phrase, not the status code? rename or fix is in order?

@ravenblackx ravenblackx added area/http and removed triage Issue requires triage labels May 13, 2024
@bencebeky
Contributor

> Also, looks to me like onStatus is returning the reason phrase, not the status code? rename or fix is in order?

That is correct. This is pre-existing from before BalsaParser was added. What makes most sense to me is to lump fixing this with other cleanup tasks in the far future when http-parser gets removed.

@bencebeky
Contributor

> During UpstreamHttp11ConnectSocket::isValidConnectResponse(), a framer is used under the hood to parse the header. Later its statusCode() is used to determine validity.
>
> However, when the response header contains content-length: 0, BalsaParser::MessageDone() will be invoked which purges the parsed header. Later when statusCode() is called, since the header is empty, the default 0 status code is returned and the response is mistaken as invalid.
>
> This bug is sneaky because when content-length isn't present, the framer waits to read more data until the connection closes, and thus MessageDone() is never called.
>
> I think the correct fix should be to not reset the header when the framer is cleared, because the header is owned by the BalsaParser.

Thank you for raising this issue.

I don't think the right approach is to not reset BalsaHeaders along with BalsaFrame. BalsaParser is designed to parse multiple messages, so it makes sense to reset BalsaHeaders and BalsaFrame at the same time. I propose saving the value of the status code when the headers are parsed, and keeping it either until the first byte of the next message is parsed, or as long as possible, until the headers of the next message are parsed.

Another approach would be not to reset BalsaFrame and BalsaHeaders when parsing the message is complete. I think as long as we only need access to a single integer, it is better to reset them early, because they take up memory, but this would also be a viable approach if more state would need to be accessed.
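The first proposal (saving the status code when headers are parsed) could look roughly like the sketch below. This is a toy illustration under stated assumptions, not the actual BalsaParser: the class `LatchingToyParser` and its callback names are hypothetical, and only a single integer is latched.

```cpp
#include <cassert>

// Toy parser that resets its live state early but keeps a latched copy
// of the status code. Hypothetical names; NOT the real BalsaParser.
class LatchingToyParser {
public:
  // Called when the headers of a message have been parsed.
  void onHeadersParsed(int status) {
    live_status_ = status;
    latched_status_ = status; // replaced only by the next message's headers
  }

  // Called when the message is complete: live state is freed early.
  void onMessageDone() { live_status_ = 0; }

  int liveStatus() const { return live_status_; }
  int statusCode() const { return latched_status_; }

private:
  int live_status_ = 0;    // models state held in the headers object
  int latched_status_ = 0; // the single saved integer
};

void demoLatchedStatus() {
  LatchingToyParser parser;
  parser.onHeadersParsed(200);
  parser.onMessageDone();
  assert(parser.liveStatus() == 0);   // headers were reset
  assert(parser.statusCode() == 200); // latched value survives

  parser.onHeadersParsed(503);        // next message overwrites the latch
  assert(parser.statusCode() == 503);
}
```

The trade-off is that every accessor that reads from the headers needs its own latched field.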

@alyssawilk
Contributor

is it just the single integer? looks like isHttp11 also refers to headers, as does contentLength and possibly others (I stopped at n=2). I think it'd be safer to keep headers around rather than trying to latch all of the fields and hope we don't forget to do so in future.

@bencebeky
Contributor

> is it just the single integer? looks like isHttp11 also refers to headers, as does contentLength and possibly others (I stopped at n=2). I think it'd be safer to keep headers around rather than trying to latch all of the fields and hope we don't forget to do so in future.

Oh thanks, that's a very good point.

alyssawilk pushed a commit that referenced this issue May 15, 2024
Delay resetting BalsaFrame until the first byte of the next message is parsed. This is necessary for being able to access information about the parsed headers after the message is fully parsed, for example, response status code, and content-length and transfer-encoding headers.
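As a rough illustration of the delayed-reset idea in this commit message, the sketch below marks the state as stale when a message completes and clears it only when the next non-empty input arrives. It is a toy model and not the code from the commit: `DelayedResetToyParser`, `onInput()`, and the `reset_pending_` flag are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>

// Toy parser that defers its reset until the next message starts.
// Hypothetical names; NOT the real BalsaParser or BalsaFrame.
class DelayedResetToyParser {
public:
  // Called with the size of each input chunk before it is parsed.
  void onInput(std::size_t length) {
    if (length > 0 && reset_pending_) {
      // First byte of the next message: now it is safe to reset.
      status_code_ = 0;
      content_length_ = -1;
      reset_pending_ = false;
    }
  }

  void onHeadersParsed(int status, long content_length) {
    status_code_ = status;
    content_length_ = content_length;
  }

  // Message complete: remember that a reset is due, but keep the state.
  void onMessageDone() { reset_pending_ = true; }

  int statusCode() const { return status_code_; }
  long contentLength() const { return content_length_; } // -1 if unknown

private:
  int status_code_ = 0;
  long content_length_ = -1;
  bool reset_pending_ = false;
};

void demoDelayedReset() {
  DelayedResetToyParser parser;
  parser.onInput(64);
  parser.onHeadersParsed(200, 0);
  parser.onMessageDone();
  assert(parser.statusCode() == 200); // still readable after completion
  assert(parser.contentLength() == 0);

  parser.onInput(0);                  // empty input does not trigger reset
  assert(parser.statusCode() == 200);

  parser.onInput(1);                  // next message begins
  assert(parser.statusCode() == 0);
}
```

Because the whole state is kept, all header-derived accessors stay valid after completion without latching each field separately.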

Fixes #34096

Balsa implementation tracking issue: #21245

Commit Message: [balsa] Delay resetting BalsaFrame.
Additional Description:
Risk Level: low
Testing: test/extensions/transport_sockets/http_11_proxy:connect_test
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a
Runtime guard: yes

Signed-off-by: Bence Béky <bnc@google.com>
Co-authored-by: Bence Béky <bnc@google.com>
5 participants