Curl.perform() blocks SIGINT during the start of a SOCKS transfer #706
Comments
Does it make any difference if you set the |
@swt2c No, that has no effect. Same issue also with socket_action. It might not be specifically related to remotely resolving domains; that just happens to be part of the first stage of a SOCKS connection. As far as I can see pycurl delegates SOCKS logic to libcurl anyway, but this issue could be caused by pycurl's GIL logic instead. It's possible libcurl invokes some additional callbacks when you use a proxy, since additional steps are necessary to set up such a connection. Such callbacks are where problems with GIL logic can occur, since you get nested calls leading back into pycurl at a different spot, where wrong assumptions can be made about the GIL state. Note also that libcurl invokes callbacks mostly at the very start of a transfer, which is when this issue occurs. |
My bad, there was a regression regarding SOCKS proxies in libcurl itself, already fixed in 7.71.0: curl/curl#5710 (comment) Those fixes take care of multi.perform and multi.socket_action. (They also fixed socket_action being a blocking call during SOCKS kickstart, so SOCKS transfers now play nicely with async event loops.) However, easy.perform() still blocks SIGINT when a SOCKS proxy is used. Not important for me personally, but I'll leave the issue open and change the title accordingly. |
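Since the comment above reports that the multi interface is fixed while easy.perform() is not, one possible workaround is to drive a single transfer through CurlMulti. The following is only a sketch under that assumption (libcurl 7.71.0 or newer); it has not been verified against this bug. The names `drive` and `fetch` are hypothetical, and `pycurl` is imported lazily so the loop itself can be exercised without it.

```python
# Value of pycurl.E_CALL_MULTI_PERFORM (libcurl's CURLM_CALL_MULTI_PERFORM).
CALL_MULTI_PERFORM = -1


def drive(multi, poll_seconds=0.5):
    """Run all transfers on a CurlMulti-like object until none are active.

    Control returns to the Python interpreter between perform() and
    select() calls, which gives it a chance to raise KeyboardInterrupt.
    """
    num_active = 1
    while num_active:
        # Call perform() until libcurl no longer asks to be called again.
        while True:
            status, num_active = multi.perform()
            if status != CALL_MULTI_PERFORM:
                break
        if num_active:
            # Wait for socket activity, but never longer than poll_seconds.
            multi.select(poll_seconds)


def fetch(url, proxy):
    """Hypothetical helper: fetch url through proxy using the multi interface."""
    import pycurl  # third-party; imported here so drive() has no dependency

    easy = pycurl.Curl()
    easy.setopt(pycurl.URL, url)
    easy.setopt(pycurl.PROXY, proxy)
    multi = pycurl.CurlMulti()
    multi.add_handle(easy)
    try:
        drive(multi)
    finally:
        multi.remove_handle(easy)
        multi.close()
        easy.close()
```

Calling `fetch(url, 'socks5h://127.0.0.1:9050')` would then replace the `curl.perform()` call from the examples in this thread.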
Are you sure that it is pycurl doing that and not libcurl? |
@swt2c Doesn't seem so because of example 3, which doesn't block SIGINT. CLI curl uses When I have some spare time I'll rewrite example 1 in libcurl. |
Here's a libcurl example and its pycurl equivalent, setting a SOCKS proxy via CURLOPT_PROXY. The libcurl one terminates immediately on SIGINT. The pycurl one raises KeyboardInterrupt only after perform() returns.
libcurl:
#include <curl/curl.h>
#include <stdio.h>
int main(void)
{
    printf("%s\n", curl_version());
    CURL *curl = curl_easy_init();
    curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);
    curl_easy_setopt(curl, CURLOPT_URL, "http://jvgypgbnfyvfopg5msp6nwr2sl2fd6xmnguq35n7rfkw3yungjn2i4yd.onion/");
    curl_easy_setopt(curl, CURLOPT_PROXY, "socks5h://127.0.0.1:9050");
    /* doesn't block SIGINT */
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    return 0;
}
pycurl:
import pycurl
print(pycurl.version)
curl = pycurl.Curl()
curl.setopt(pycurl.VERBOSE, 1)
curl.setopt(pycurl.URL, 'http://jvgypgbnfyvfopg5msp6nwr2sl2fd6xmnguq35n7rfkw3yungjn2i4yd.onion/')
curl.setopt(pycurl.PROXY, 'socks5h://127.0.0.1:9050')
# blocks SIGINT
curl.perform()
curl.close()
I've seen this happen only when a SOCKS proxy is set via pycurl. Other means of proxying the same Python script don't have this issue, for example when removing the
I've also tested with different SSL libraries (openssl, gnutls, nss) when building pycurl and the above libcurl example, and it made no difference. |
If the wait is inside libcurl then I can suggest experimenting with the NOSIGNAL option and trying a non-blocking dns resolver (c-ares/threaded?). |
NOSIGNAL and non-blocking resolver make no difference on my end. I'm double-checking the presence of async resolver with:
print('ASYNCHDNS:', pycurl.version_info()[4] & pycurl.VERSION_ASYNCHDNS)
I also tried removing whatever setopts pycurl does internally as well as the GIL code in do_curl_perform() (BEGIN/END_ALLOW_THREADS), and SIGINT still doesn't interrupt curl_easy_perform(). I can't see any other spot that could cause this issue. Could someone else test this with a SOCKS proxy? |
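While the cause is unknown, a generic Python pattern for keeping Ctrl+C responsive around a blocking call is to run the call in a worker thread and let the main thread wait in short steps. The sketch below uses only the standard library; `run_in_worker` is a hypothetical name, and whether it helps with this particular issue is an untested assumption. Note that the daemon thread is not cancelled on Ctrl+C; it is simply abandoned when the interpreter exits.

```python
import threading


def run_in_worker(fn, poll_seconds=0.2):
    """Run fn() in a daemon thread and return its result.

    The main thread only waits in short join() steps, so a pending
    KeyboardInterrupt can be raised here even while fn() is still blocked.
    An exception raised by fn() is re-raised in the calling thread.
    """
    outcome = {}

    def target():
        try:
            outcome["value"] = fn()
        except BaseException as exc:  # hand every failure back to the caller
            outcome["error"] = exc

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    while worker.is_alive():
        worker.join(poll_seconds)
    if "error" in outcome:
        raise outcome["error"]
    return outcome.get("value")
```

With the pycurl example from this thread, the call would look like `run_in_worker(curl.perform)`.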
I suggest trying a modern libcurl version where the SOCKS connect procedure has been remade to be totally non-blocking. |
@bagder |
Curl.perform() and CurlMulti.perform() can't be interrupted during the DNS stage when a "socks5h://" proxy is set - i.e. when the domain name is resolved remotely.
At the same time it is possible to interrupt in the following cases:
You can try running the following examples, but you should hit Ctrl+C immediately to see the difference.
Example 1: pycurl.PROXY [not interruptible]
Example 2: torsocks + pycurl [interruptible]
Same as above, but without the
c.setopt(pycurl.PROXY, PROXY)
line. Run with torsocks:
This is interruptible, probably because pycurl isn't communicating with the SOCKS proxy on its own; that is delegated to the torsocks wrapper without pycurl knowing anything about it. So the above issue is probably located in how pycurl handles proxies.
Example 3: curl --proxy [interruptible]
This is the same as the first example, but with curl instead of pycurl. It is interruptible as expected, so the issue doesn't go as deep as the libcurl level - it is pycurl-specific.
Versions
PycURL/7.44.1 libcurl/7.68.0 GnuTLS/3.6.13 zlib/1.2.11 brotli/1.0.7 libidn2/2.2.0 libpsl/0.21.0 (+libidn2/2.2.0) libssh/0.9.3/openssl/zlib nghttp2/1.40.0 librtmp/2.3
Python 3.8.10
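To compare a build against the 7.71.0 release that a later comment in this thread names as containing the libcurl SOCKS fix, the libcurl version can be parsed out of the `pycurl.version` string. A minimal sketch, with the hypothetical helper name `libcurl_version` and the version string reported above as sample input:

```python
import re

# Release that the thread names as containing the libcurl SOCKS fix.
SOCKS_FIX = (7, 71, 0)

# Version string reported in this issue (what pycurl.version returned).
REPORTED = (
    "PycURL/7.44.1 libcurl/7.68.0 GnuTLS/3.6.13 zlib/1.2.11 brotli/1.0.7 "
    "libidn2/2.2.0 libpsl/0.21.0 (+libidn2/2.2.0) "
    "libssh/0.9.3/openssl/zlib nghttp2/1.40.0 librtmp/2.3"
)


def libcurl_version(version_string):
    """Return the libcurl version in version_string as a tuple of ints."""
    match = re.search(r"libcurl/(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("no libcurl version found in %r" % version_string)
    return tuple(int(part) for part in match.groups())


if __name__ == "__main__":
    found = libcurl_version(REPORTED)
    print("libcurl", found, "predates fix:", found < SOCKS_FIX)
```

Passing `pycurl.version` instead of `REPORTED` checks the locally installed build.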