New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add unix domain socket support to HTTP transport #1681
base: master
Are you sure you want to change the base?
Conversation
Welcome to GitGitGadgetHi @lcfyi, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests. Please make sure that either:
You can CC potential reviewers by adding a footer to the PR description with the following syntax:
Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:
It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code. Contributing the patchesBefore you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form Both the person who commented An alternative is the channel
Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment If you want to see what email(s) would be sent for a After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail). If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the curl -g --user "<EMailAddress>:<Password>" \
--url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt To iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description):
To send a new iteration, just add another PR comment with the contents: Need help?New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join. You may also be able to find help in real time in the developer IRC channel, |
There are issues in commit 40c6598: |
There are issues in commit 44fc8b9: |
There are issues in commit 3c8aa87: |
ff2e060
to
3e53163
Compare
/allow |
User lcfyi is now allowed to use GitGitGadget. |
/preview |
Preview email sent as pull.1681.git.git.1708506711460.gitgitgadget@gmail.com |
/submit |
Submitted as pull.1681.git.git.1708506863243.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Eric Wong wrote (reply to this): Leslie Cheng via GitGitGadget <gitgitgadget@gmail.com> wrote:
> Subject: Re: [PATCH] Add unix domain socket support to HTTP transport.
No need for trailing `.' in commit message titles
<snip>
> @@ -455,6 +458,20 @@ static int http_options(const char *var, const char *value,
> return 0;
> }
>
> + if (!strcmp("http.unixsocket", var)) {
> +#ifdef GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH
> +#ifndef NO_UNIX_SOCKETS
> + return git_config_string(&curl_unix_socket_path, var, value);
> +#else
> + warning(_("Unix socket support unavailable in this build of Git"));
> + return 0;
> +#endif
> +#else
> + warning(_("Unix socket support is not supported with cURL < 7.40.0"));
> + return 0;
> +#endif
> + }
Personally, I'd hoist the #ifdef part into a standalone function
since I find mixing CPP and C conditionals confusing.
disclaimer: I'm an easily confused person and don't usually
program in C, though.
<snip>
> --- /dev/null
> +++ b/t/t5565-http-unix-domain-socket.sh
<snip>
> + nc -klU "$socket_path" <tcp_to_uds >uds_to_tcp &
> + UDS_PID=$!
> +
> + nc "$host" "$port" >tcp_to_uds <uds_to_tcp &
`nc' isn't widely installed, its supported flags vary between
implementations, and our test suite doesn't currently use it.
I suggest either using a small bit of Perl or writing a t/helper
program to do its job.
Finally, hard tabs should be used for indentation throughout.
I'll wait on others to comment since I haven't looked at git
hacking in a while.
Anyways, I think this feature could be useful for me, too :>
Thanks. |
User |
3e53163
to
e4e0418
Compare
e4e0418
to
8548efc
Compare
On the Git mailing list, Leslie Cheng wrote (reply to this): > No need for trailing `.' in commit message titles
Will fix in the next patch, sorry!
> Personally, I'd hoist the #ifdef part into a standalone function
> since I find mixing CPP and C conditionals confusing.
> > disclaimer: I'm an easily confused person and don't usually
> program in C, though.
I considered extracting it out, but the other conditionals in this function follow a similar pattern so I didn't want to change it. However, my use here is also the first time there's both an #ifdef and nested #ifndef, which I agree makes it a bit confusing to grok.
I'm open to changing it, but I'll let it sit and marinate for a bit.
> `nc' isn't widely installed, its supported flags vary between
> implementations, and our test suite doesn't currently use it.
> I suggest either using a small bit of Perl or writing a t/helper
> program to do its job.
> > Finally, hard tabs should be used for indentation throughout.
> > I'll wait on others to comment since I haven't looked at git
> hacking in a while.
> > Anyways, I think this feature could be useful for me, too :>
> Thanks.
Good catch, I'll fix in the next patch. I've subbed `nc` out for a simple Perl script to pipe back and forth, just making sure CI is happy about this before submitting. |
User |
This changeset introduces an `http.unixSocket` option so that users can proxy their git over HTTP remotes to a unix domain socket. In terms of why, since UDS are local and git already has a local protocol: some corporate environments use a UDS to proxy requests to internal resources (ie. source control), so this change would support those use-cases. This proxy can occasionally be necessary to attach MFA tokens or client certificates for CLI tools. The implementation leverages `--unix-socket` option [0] via the `CURLOPT_UNIX_SOCKET_PATH` flag available with libcurl [1]. `GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH` and `NO_UNIX_SOCKETS` were kept separate so that we can spit out better error messages for users if git was compiled with `NO_UNIX_SOCKETS`. [0] https://curl.se/docs/manpage.html#--unix-socket [1] https://curl.se/libcurl/c/CURLOPT_UNIX_SOCKET_PATH.html Signed-off-by: Leslie Cheng <leslie@lc.fyi>
8548efc
to
2af5cc8
Compare
/submit |
Submitted as pull.1681.v2.git.git.1708653536115.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Junio C Hamano wrote (reply to this): "Leslie Cheng via GitGitGadget" <gitgitgadget@gmail.com> writes:
> Subject: Re: [PATCH v2] Add unix domain socket support to HTTP transport
Perhaps
Subject: [PATCH] http: enable proxying via unix-domain socket
to follow the usual "<area>: <description>" format?
> From: Leslie Cheng <leslie.cheng5@gmail.com>
>
> This changeset introduces an `http.unixSocket` option so that users can
"This changeset introduces" -> "Introduce". There may be other
gotchas that might use help from Documentation/SubmittingPatches,
but I didn't read too carefully.
Besides, it is a single patch, not a set of changes ;-).
`http.unixSocket` is a configuration variable. It may be confusing
to use the word "option". Speaking of options, shouldn't there be a
command line option that overrides the configured value?
We should honor the usual http.<url>.VARIABLE convention where
http.<url>.VARIABLE that is destination-specific overrides a more
generic http.VARIABLE configuration variable.
> proxy their git over HTTP remotes to a unix domain socket. In terms of
> why, since UDS are local and git already has a local protocol: some
> corporate environments use a UDS to proxy requests to internal resources
> (ie. source control), so this change would support those use-cases. This
"ie." -> "i.e.,"?
> proxy can occasionally be necessary to attach MFA tokens or client
> certificates for CLI tools.
>
> The implementation leverages `--unix-socket` option [0] via the
> `CURLOPT_UNIX_SOCKET_PATH` flag available with libcurl [1].
There is a feature in libcURL library, that is enabled by setting
the CURLOPT_UNIX_SOCKET_PATH option via the curl_easy_setopt() call,
and their command line utility. You do the same to implement this
feature. But when you are not adding "--unix-socket" option to any
of our commands, mention of that option name makes it more confusing
than necessary.
The usual way to compose a log message of this project is to
- Give an observation on how the current system work in the present
tense (so no need to say "Currently X is Y", just "X is Y"), and
discuss what you perceive as a problem in it.
- Propose a solution (optional---often, problem description
trivially leads to an obvious solution in reader's minds).
- Give commands to the codebase to "become like so".
in this order.
How about following that convention, perhaps like:
In some corporate environments, the proxy server listens to a
local unix domain socket for requests, instead of listening to a
network port. Even though we have http.proxy (and more
destination specific http.<url>.proxy) configuration variables
to specify the network address/port of a proxy, that would not
help if your proxy does not listen to the network.
Introduce an `http.unixSocket` (and `http.<url>.unixSocket`)
configuration variables that specify the path to a unix domain
socket for such a proxy. Recent versions of libcURL library
added CURLOPT_UNIX_SOCKET_PATH to support "curl --unix-socket
<path>"---use the same mechanism to implement it.
> `GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH` and `NO_UNIX_SOCKETS` were kept
> separate so that we can spit out better error messages for users if git
> was compiled with `NO_UNIX_SOCKETS`.
Unlike NO_UNIX_SOCKETS, GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH is
entirely internal to your implementation and not surfaced to neither
the end-users or the binary packagers. Because of that, I suspect
that any description that has to use that name probably falls on the
other side of "too much implementation details" to be useful to help
future developers..
Besides, I suspect that GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH might
not be the optimum approach. See below.
> diff --git a/Documentation/config/http.txt b/Documentation/config/http.txt
> index 2d4e0c9b869..bf48cbd599a 100644
> --- a/Documentation/config/http.txt
> +++ b/Documentation/config/http.txt
> @@ -277,6 +277,11 @@ http.followRedirects::
> the base for the follow-up requests, this is generally
> sufficient. The default is `initial`.
>
> +http.unixSocket::
> + Connect through this Unix domain socket via HTTP, instead of using the
> + network. If set, this config takes precendence over `http.proxy` and
> + is incompatible with the proxy options (see `curl(1)`).
Talking about precedence between this and http.proxy is good thing,
but one very important piece of information is missing. What value
does it take?
The absolute path of a unix-domain socket to pass the HTTP
traffic over, instead of using the network.
or something, perhaps?
> diff --git a/git-curl-compat.h b/git-curl-compat.h
> index fd96b3cdffd..f0f3bec0e17 100644
> --- a/git-curl-compat.h
> +++ b/git-curl-compat.h
> @@ -74,6 +74,13 @@
> #define GIT_CURL_HAVE_CURLE_SSL_PINNEDPUBKEYNOTMATCH 1
> #endif
>
> +/**
> + * CURLOPT_UNIX_SOCKET_PATH was added in 7.40.0, released in January 2015.
> + */
> +#if LIBCURL_VERSION_NUM >= 0x074000
> +#define GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH 1
> +#endif
The "HAVE" part in GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH is a
statement of a fact. If the version of cURL library we have is
certain value, we have it. OK.
> diff --git a/http.c b/http.c
> index e73b136e589..8cfdcaeac82 100644
> --- a/http.c
> +++ b/http.c
> @@ -79,6 +79,9 @@ static const char *http_proxy_ssl_ca_info;
> static struct credential proxy_cert_auth = CREDENTIAL_INIT;
> static int proxy_ssl_cert_password_required;
It might make the code easier to follow if you did:
#if !defined(NO_CURLOPT_UNIX_SOCKET_PATH) && !defined(NO_UNIX_SOCKETS)
#if defined(GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH)
#define USE_CURLOPT_UNIX_SOCKET_PATH
#endif
#endif
The points are
(1) the users can decline to use CURLOPT_UNIX_SOCKET_PATH while
still using unix domain socket for other purposes, and
(2) you do not have to care if you HAVE it or not most of time;
what matters more often is if the user told you to USE it.
Hmm?
> +#if defined(GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH) && !defined(NO_UNIX_SOCKETS)
> +static const char *curl_unix_socket_path;
> +#endif
The guard here would become "#ifdef USE_CURLOPT_UNIX_SOCKET_PATH" if
we wanted this to be conditional, but I think it is easier to make
the variable unconditionally available; see below.
> @@ -455,6 +458,20 @@ static int http_options(const char *var, const char *value,
> return 0;
> }
>
> + if (!strcmp("http.unixsocket", var)) {
> +#ifdef GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH
> +#ifndef NO_UNIX_SOCKETS
> + return git_config_string(&curl_unix_socket_path, var, value);
> +#else
> + warning(_("Unix socket support unavailable in this build of Git"));
> + return 0;
> +#endif
> +#else
> + warning(_("Unix socket support is not supported with cURL < 7.40.0"));
> + return 0;
> +#endif
> + }
In general, it is inadvisable to issue a warning in the codepath
that parses configuration variables, as the values we read may not
be necessarily used. We could instead accept the given path into a
variable unconditionally, and complain only before it gets used,
near the call to curl_easy_setopt().
> if (!strcmp("http.cookiefile", var))
> return git_config_pathname(&curl_cookie_file, var, value);
> if (!strcmp("http.savecookies", var)) {
> @@ -1203,6 +1220,12 @@ static CURL *get_curl_handle(void)
> }
> init_curl_proxy_auth(result);
>
> +#if defined(GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH) && !defined(NO_UNIX_SOCKETS)
> + if (curl_unix_socket_path) {
> + curl_easy_setopt(result, CURLOPT_UNIX_SOCKET_PATH, curl_unix_socket_path);
> + }
> +#endif
Here, the guard may become more like
if (curl_unix_socket_path) {
#ifdef USE_CURLOPT_UNIX_SOCKET_PATH
curl_easy_setopt(...);
#elif defined(NO_CURLOPT_UNIX_SOCKET_PATH) || defined(NO_UNIX_SOCKETS)
warning(_("this build disables the unix-domain-socket feature"));
#elif
warning(_("your cURL library is too old"));
#endif
} |
On the Git mailing list, Junio C Hamano wrote (reply to this): Junio C Hamano <gitster@pobox.com> writes:
> "Leslie Cheng via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> Subject: Re: [PATCH v2] Add unix domain socket support to HTTP transport
>
> Perhaps
>
> Subject: [PATCH] http: enable proxying via unix-domain socket
>
> to follow the usual "<area>: <description>" format?
>
>> From: Leslie Cheng <leslie.cheng5@gmail.com>
>>
>> This changeset introduces an `http.unixSocket` option so that users can
>
> "This changeset introduces" -> "Introduce". There may be other
> gotchas that might use help from Documentation/SubmittingPatches,
> but I didn't read too carefully.
>
> Besides, it is a single patch, not a set of changes ;-).
>
> `http.unixSocket` is a configuration variable. It may be confusing
> to use the word "option". Speaking of options, shouldn't there be a
> command line option that overrides the configured value?
>
> We should honor the usual http.<url>.VARIABLE convention where
> http.<url>.VARIABLE that is destination-specific overrides a more
> generic http.VARIABLE configuration variable.
Clarification. I know the above is automatically achieved, given
the way we have laid urlmatch foundation to allow easy parsing for
configuration variables structured this way. I did not mean that
you'd need to do anything special; rather, I meant that we should
advertise that we do in the commit log message.
|
On the Git mailing list, Leslie Cheng wrote (reply to this): On 2024-02-23 12:37 a.m., Junio C Hamano wrote:
> How about following that convention, perhaps like:
>
> In some corporate environments, the proxy server listens to a
> local unix domain socket for requests, instead of listening to a
> network port. Even though we have http.proxy (and more
> destination specific http.<url>.proxy) configuration variables
> to specify the network address/port of a proxy, that would not
> help if your proxy does not listen to the network.
>
> Introduce an `http.unixSocket` (and `http.<url>.unixSocket`)
> configuration variables that specify the path to a unix domain
> socket for such a proxy. Recent versions of libcURL library
> added CURLOPT_UNIX_SOCKET_PATH to support "curl --unix-socket
> <path>"---use the same mechanism to implement it.
This is excellent, thanks for the guidance (and all the other
suggestions prior)! I'll update in the next patch.
> Unlike NO_UNIX_SOCKETS, GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH is
> entirely internal to your implementation and not surfaced to neither
> the end-users or the binary packagers. Because of that, I suspect
> that any description that has to use that name probably falls on the
> other side of "too much implementation details" to be useful to help
> future developers..
That's reasonable, I figured it would fit as a cover letter detail but I
agree it's not relevant as a commit message that lives in the history of
the project. I'll also update this in the next patch.
> Talking about precedence between this and http.proxy is good thing,
> but one very important piece of information is missing. What value
> does it take?
>
> The absolute path of a unix-domain socket to pass the HTTP
> traffic over, instead of using the network.
>
> or something, perhaps?
I like that wording, I'll update in the next patch.
> It might make the code easier to follow if you did:
>
> #if !defined(NO_CURLOPT_UNIX_SOCKET_PATH) && !defined(NO_UNIX_SOCKETS)
> #if defined(GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH)
> #define USE_CURLOPT_UNIX_SOCKET_PATH
> #endif
> #endif
>
> The points are
>
> (1) the users can decline to use CURLOPT_UNIX_SOCKET_PATH while
> still using unix domain socket for other purposes, and
>
> (2) you do not have to care if you HAVE it or not most of time;
> what matters more often is if the user told you to USE it.
>
> Hmm?
Do you think this functionality is worth adding another macro to
conditionally include it in the build? It felt out-of-the-way enough
that we could just use the same `NO_UNIX_SOCKETS` macro to control for
environments that don't support unix domain sockets.
>> +#if defined(GIT_CURL_HAVE_CURLOPT_UNIX_SOCKET_PATH) &&
!defined(NO_UNIX_SOCKETS)
>> +static const char *curl_unix_socket_path;
>> +#endif
>
> The guard here would become "#ifdef USE_CURLOPT_UNIX_SOCKET_PATH" if
> we wanted this to be conditional, but I think it is easier to make
> the variable unconditionally available; see below.
Agreed in general, I was looking to other patterns for conditional
variables in file, e.g.
https://github.com/git/git/blob/3c2a3fdc388747b9eaf4a4a4f2035c1c9ddb26d0/http.c#L66-L68
But, revisiting, this looks like an exception rather than the norm.
> In general, it is inadvisable to issue a warning in the codepath
> that parses configuration variables, as the values we read may not
> be necessarily used. We could instead accept the given path into a
> variable unconditionally, and complain only before it gets used,
> near the call to curl_easy_setopt().
Similar to above, I followed what was already done for certain
configuration variables (e.g.
https://github.com/git/git/blob/3c2a3fdc388747b9eaf4a4a4f2035c1c9ddb26d0/http.c#L485-L501),
but I agree with your feedback since this would result in constant warnings.
To summarize, I'll do the following in the next patch:
- change the wording of the commit message
- move the conditional variables and parsing to a check at
`curl_easy_setopt()` time
I'm still undecided on whether I should introduce another macro
specifically for this functionality, and I'd like to hear your thoughts
on why it might make sense.
|
Changes since v1:
nc
to proxy between UDS and TCP socket; I chose not to split this out into a library since its use is hyper-specific and has a dependency onlib-httpd.sh
cc: Eric Wong e@80x24.org
cc: Leslie Cheng leslie.cheng5@gmail.com