Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable llhttp for HTTP parsing #6713

Merged
merged 9 commits into from Apr 23, 2024
Merged

Enable llhttp for HTTP parsing #6713

merged 9 commits into from Apr 23, 2024

Conversation

sgallagher
Copy link
Contributor

Fixes: #6074

src/libgit2/transports/httpclient.c Outdated Show resolved Hide resolved
@sergio-correia
Copy link
Contributor

src/util/net.c also has an http_parser.h include that should likely be removed.

ci/docker/fedora Outdated
krb5-workstation \
krb5-libs \
krb5-devel \
pcre2-devl \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: pcre2-devel

@sergio-correia
Copy link
Contributor

sergio-correia commented Feb 2, 2024

@ethomson: Hi, would you mind running the CI for this PR, please, so we see what is failing? Thanks in advance.

@sgallagher
Copy link
Contributor Author

@sergio-correia It looks like one of the issues here is that llhttp_execute() returns an error code whereas http_parser_execute() returned the number of bytes that it parsed. This then fails a check later on because error code 0 definitely doesn't match the expected number of bytes.

This also appears to be a problem in your patch for tang.

@sergio-correia
Copy link
Contributor

@sergio-correia It looks like one of the issues here is that llhttp_execute() returns an error code whereas http_parser_execute() returned the number of bytes that it parsed. This then fails a check later on because error code 0 definitely doesn't match the expected number of bytes.

This also appears to be a problem in your patch for tang.

Nice catch, thanks, @sgallagher! I submitted a PR to your branch that should hopefully address this and the remaining issues. I will do a follow-up patch for tang later on.

sergio-correia and others added 2 commits February 7, 2024 17:00
So that we can test a build with llhttp instead of http-parser.

Co-authored-by: Stephen Gallagher <sgallagh@redhat.com>
Signed-off-by: Sergio Correia <scorreia@redhat.com>
Signed-off-by: Stephen Gallagher <sgallagh@redhat.com>
Fixes: libgit2#6074

We now try to use llhttp by default, falling back to http-parser
if the former is not available.

As a last resort, we use the bundled http-parser.

Co-authored-by: Sergio Correia <scorreia@redhat.com>
Signed-off-by: Stephen Gallagher <sgallagh@redhat.com>
Signed-off-by: Sergio Correia <scorreia@redhat.com>
@sgallagher
Copy link
Contributor Author

Thanks to a lot of help from @sergio-correia, I think this is in proper shape for review. @ethomson if you could kick off the pipeline check, I'd appreciate it.

@sgallagher sgallagher changed the title DRAFT: Enable llhttp for HTTP parsing Enable llhttp for HTTP parsing Feb 7, 2024
@sgallagher
Copy link
Contributor Author

I can't reproduce that valgrind error locally. Given where it's located in the code, I suspect it may have been a bug in glibc getaddrinf() that has been fixed in the meantime. Could you retry the check?

@ethomson
Copy link
Member

Hiya - many apologies for the delay. Thank you so much for all this — one question I have is whether we really need to support both llhttp and the old Node http_parser? It would simplify the code quite a bit to only support the newfangled llhttp, which we could bundle as a dependency (removing the old Node http_parser).

If there's a reason to keep the old support, I'm willing. Just curious.

Include llhttp as a bundled dependency with the aim to use it as our
default http parser, removing the now-unmaintained Node.js http-parser.
Avoid #ifdef's in httpclient.c, and move http parsing into its own file.
Users can still use the legacy Node.js http-parser library, but now we
bundle llhttp and prefer it.
Use fedora's valgrind instead of trying to build our own; omit false
positive leaks in getaddrinfo;
@sgallagher
Copy link
Contributor Author

Hiya - many apologies for the delay. Thank you so much for all this — one question I have is whether we really need to support both llhttp and the old Node http_parser? It would simplify the code quite a bit to only support the newfangled llhttp, which we could bundle as a dependency (removing the old Node http_parser).

If there's a reason to keep the old support, I'm willing. Just curious.

I mostly left it alone because I wasn't prepared to try to entirely extract http-parser since you were carrying a bundled copy. If you're willing to make the transition directly, I'm fully on-board. Upstream http-parser has been dead for four years and I have a sneaking suspicion that there are probably a number of unaddressed security vulnerabilities there that have gone unreported.

@ethomson
Copy link
Member

Thanks for tackling this PR - it's been on my wish-list for a while to move to llhttp for a plurality of reasons.

After thinking about it a bit more, I realized that other distros should be able to continue to use their existing http-parsers. So I kept the option to use http-parser or llhttp. But I added llhttp as the bundled option, and removed the http-parser that we were bundling. I also brought some of our abstractions a bit more in line with libgit2's typical code style. This lets the http client talk to a generic http parser without having to know what kind of http parser it is, and encapsulates the http parser switching logic in http parser code.

Thanks again! Excited to see this.

@sgallagher
Copy link
Contributor Author

Thanks for tackling this PR - it's been on my wish-list for a while to move to llhttp for a plurality of reasons.

Happy to help :)

After thinking about it a bit more, I realized that other distros should be able to continue to use their existing http-parsers. So I kept the option to use http-parser or llhttp. But I added llhttp as the bundled option, and removed the http-parser that we were bundling. I also brought some of our abstractions a bit more in line with libgit2's typical code style. This lets the http client talk to a generic http parser without having to know what kind of http parser it is, and encapsulates the http parser switching logic in http parser code.

Thanks for get it over the finish line. Obviously it's my first contribution to libgit2, so I appreciate the help.

Thanks again! Excited to see this.

Us too! This is (I think) the last major package in Fedora that's pulling in http-parser, so getting rid of that dependency here is a big deal for us.

@ethomson
Copy link
Member

Thanks for get it over the finish line. Obviously it's my first contribution to libgit2, so I appreciate the help.

Appreciate your kind words - I tend to just jump in and meddle in PRs which is not necessarily the most constructive (or polite) way to work.

Can you triple check me and make sure that I didn't break anything for your use case? I'll hit the shiny green button if it's all good.

@sgallagher
Copy link
Contributor Author

Everything looks good to me, though I am curious why you moved the CI run for Fedora to Nightly rather than as part of the merge request workflow. Is it just too slow?

@ethomson
Copy link
Member

Ah right. Yeah, I want to keep the CI runs minimal enough to just cover the various things that we support, but still be fast. And then nightlies can be more exhaustive.

There's already more than I would like in the CIs, they're awfully slow. I'm happy to rethink what we have where.

@ethomson ethomson merged commit b739aca into libgit2:main Apr 23, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

http-parser library is unmaintained
4 participants