Skip to content
This repository has been archived by the owner on Nov 6, 2022. It is now read-only.

Suggestion for making F_SKIPBODY more useful #505

Open
pkholland opened this issue Apr 4, 2020 · 0 comments
Open

Suggestion for making F_SKIPBODY more useful #505

pkholland opened this issue Apr 4, 2020 · 0 comments

Comments

@pkholland
Copy link

pkholland commented Apr 4, 2020

The current usage of the F_SKIPBODY bit in parser->flags is documented as having to do with parsing HEAD responses. Those will generally have a content-length field, but no matching body. So one's implementation of on_headers_complete can return a 1, telling the parser that you want it to skip attempting to parse the message body. When you return 1 from this function it sets this bit in parser->flags (currently line 1851 in http_parser.c).

But there are other reasons that one might want to use the parser to parse just the headers and then do something else for the body. The current behavior when F_SKIPBODY gets set is strange, and somewhat unpredictable. After F_SKIPBODY is set you quickly get to line 1886 which resets the parser state to NEW_MESSAGE, and then continues on with the main loop. If the original call to http_parser_execute happened to have read and passed more data than where the body starts, the main parsing loop will then attempt to parse the body of the message as if it were some kind of next message. This frequently fails on the first byte because it doesn't match any of the legal G, P, D, etc... bytes that can start a message, and in that case doesn't consume that byte. In those cases http_parser_execute returns in a non-errored state and its return value tells the caller where the message body starts. But when the message body starts with a G (and not followed by E), for example, the parser consumes that first G and returns an offset to the second byte of the message body.

I feel like it is much more sensible if the code was something like:

1885        if (parser->flags & F_SKIPBODY) {
1886          UPDATE_STATE(NEW_MESSAGE());
1887          CALLBACK_NOTIFY(message_complete);
1888          len = (p - data) + 1;    // <--- add this line to stop the parsing loop

That would declare that if on_headers_complete returns 1, the parser will not attempt to read or parse any data past the end of the headers.

Thoughts?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant