
requests leaves a hanging connection for each request #520

Closed
Bluehorn opened this issue Mar 29, 2012 · 62 comments

@Bluehorn

This was already noted in #458, but deserves an extra ticket: when using the simple API (via requests.head/get/post etc.), the requests module leaves a connection open until it is garbage collected. This eats up resources on the server side and is really impolite.

How to reproduce:

Run this script (can somebody please tell me how one can embed a gist in markdown? The help page shows only the result of embedding...):

https://gist.github.com/2234274

Result on my system:

torsten@sharokan:~/workspace/loco2-git$ python demo.py
Open sockets after 20 head requests:
COMMAND   PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
python  13928 torsten    3u  IPv4 616968      0t0  TCP sharokan.fritz.box:50072->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten    4u  IPv4 616972      0t0  TCP sharokan.fritz.box:50073->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten    5u  IPv4 616976      0t0  TCP sharokan.fritz.box:50074->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten    6u  IPv4 616980      0t0  TCP sharokan.fritz.box:50075->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten    7u  IPv4 616984      0t0  TCP sharokan.fritz.box:50076->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten    8u  IPv4 616988      0t0  TCP sharokan.fritz.box:50077->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten    9u  IPv4 616992      0t0  TCP sharokan.fritz.box:50078->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   10u  IPv4 616996      0t0  TCP sharokan.fritz.box:50079->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   11u  IPv4 617000      0t0  TCP sharokan.fritz.box:50080->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   12u  IPv4 617004      0t0  TCP sharokan.fritz.box:50081->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   13u  IPv4 617008      0t0  TCP sharokan.fritz.box:50082->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   14u  IPv4 617012      0t0  TCP sharokan.fritz.box:50083->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   15u  IPv4 617016      0t0  TCP sharokan.fritz.box:50084->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   16u  IPv4 617020      0t0  TCP sharokan.fritz.box:50085->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   17u  IPv4 617024      0t0  TCP sharokan.fritz.box:50086->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   18u  IPv4 617028      0t0  TCP sharokan.fritz.box:50087->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   19u  IPv4 617032      0t0  TCP sharokan.fritz.box:50088->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   20u  IPv4 617036      0t0  TCP sharokan.fritz.box:50089->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   21u  IPv4 617040      0t0  TCP sharokan.fritz.box:50090->bk-in-f94.1e100.net:www (ESTABLISHED)
python  13928 torsten   22u  IPv4 617044      0t0  TCP sharokan.fritz.box:50091->bk-in-f94.1e100.net:www (ESTABLISHED)
Garbage collection result: 1700
Open sockets after garbage collection:

I think the sockets should be closed immediately when the session is unrefed. FYI, I found this when testing a local webservice by running it in CherryPy which defaults to 10 threads - after 10 requests, the web server was blocked.

@Bluehorn
Author

Digging further, this seems due to a reference loop and is really a problem inherited from httplib: The HTTPResponse keeps a reference to the file object as _fileobject. I would argue that that reference should be dropped after the complete response was read (it should not be possible to read from the socket of a pooled connection using some old response).

@Bluehorn
Author

Okay, and the reason this shows up is because there is a reference cycle between the request and response object.
Otherwise, when the response object is dropped, the file would be closed. Due to the ref cycle, this can only happen on the next garbage collection.
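The effect of that reference cycle can be demonstrated with a stdlib-only sketch. The `Request` and `Response` classes here are simplified stand-ins for the real requests objects, modeling only the cycle; on CPython, reference counting alone never reclaims the pair, and cleanup waits for the cyclic garbage collector, exactly as described above:

```python
import gc
import weakref

# Toy stand-ins for requests.Request and requests.Response; the real
# classes carry far more state, this models only the reference cycle.
class Request:
    pass

class Response:
    def __init__(self, request):
        self.request = request
        request.response = self  # the cycle: Request <-> Response

req = Request()
resp = Response(req)
probe = weakref.ref(resp)

gc.disable()          # mirror the test script: no automatic collection
del req, resp
alive_after_del = probe() is not None  # True: refcounting can't break the cycle

gc.collect()                           # explicit collection breaks the loop
alive_after_gc = probe() is not None   # False: both objects reclaimed

print(alive_after_del, alive_after_gc)  # -> True False
```

In the real library the `Response` additionally holds the open socket (via httplib's `_fileobject`), so the delayed collection keeps a live connection around, not just memory.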

@piotr-dobrogost

Is this the same issue as in #239?

@kennethreitz
Contributor

@shazow any thoughts?

@shazow
Contributor

shazow commented Mar 31, 2012

Are we talking about urllib3 request/response objects, or requests request/response objects?

urllib3 cleans up in the response.release_conn method. The connection gets None'd for the exact reason that @Bluehorn mentioned. We could also None the self._fp but that's more for custom interfaces; I don't think httplib uses it.

We can definitely be more aggressive with closing connections in the cleanup of Requests's context manager. Right now to manually close connections, you'd need to iterate for each pool and pop its queue of connections and close them if they're open. If that's what we want to do, then I can definitely add a helper in urllib3 for "closing pools" which will do that in one call.
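The manual pool-drain described above can be sketched with the stdlib alone. `ToyConnection` and `ToyPool` are hypothetical stand-ins for urllib3's connection and `HTTPConnectionPool` (whose `pool` attribute is a `LifoQueue` of connections), not its real API:

```python
import queue

class ToyConnection:
    """Hypothetical stand-in for a pooled HTTP connection."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

class ToyPool:
    """Hypothetical stand-in for a urllib3 HTTPConnectionPool."""
    def __init__(self, size=2):
        self.created = [ToyConnection() for _ in range(size)]
        self.pool = queue.LifoQueue()
        for conn in self.created:
            self.pool.put(conn)

def close_pools(pools):
    # Iterate each pool, pop its queue of connections, and close
    # any that are still open: the "closing pools" helper in miniature.
    for p in pools:
        while True:
            try:
                conn = p.pool.get_nowait()
            except queue.Empty:
                break
            if conn is not None and not conn.closed:
                conn.close()

pools = [ToyPool(), ToyPool()]
close_pools(pools)
all_closed = all(c.closed for p in pools for c in p.created)
print(all_closed)  # -> True
```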

@gsakkis

gsakkis commented Apr 25, 2012

Another approach would be to break the hard cycle between response and request by changing req.response to a weakref proxy before returning it from the session: http://dpaste.org/GOqzA/.
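The dpaste link has since expired, but the idea amounts to something like the following sketch, with simplified stand-in classes. The `ReferenceError` at the end is the trade-off: once the last strong reference to the request is gone, the proxy dangles.

```python
import weakref

# Simplified stand-ins; the real objects carry much more state.
class Request:
    pass

class Response:
    pass

req = Request()
resp = Response()
req.response = resp
resp.request = weakref.proxy(req)  # weak link instead of a strong cycle
req.sent = True

through_proxy = resp.request.sent  # True: proxy is transparent while req lives

del req  # last strong reference gone; on CPython req is freed immediately
try:
    resp.request.sent
    dangling = False
except ReferenceError:  # "weakly-referenced object no longer exists"
    dangling = True

print(through_proxy, dangling)  # -> True True
```

With no hard cycle, plain reference counting frees both objects (and the socket they hold) as soon as they go out of scope, without waiting for a garbage collection pass.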

@slingamn
Contributor

slingamn commented May 4, 2012

@Bluehorn's test continues to reproduce in trunk for me. Thanks!

Here's something else interesting. If I change the test like so:

diff --git a/conntest.py b/conntest.py
index 3eac6c7..ec30d3f 100644
--- a/conntest.py
+++ b/conntest.py
@@ -2,8 +2,10 @@ import gc, os, subprocess, requests

 gc.disable()         # Just to make sure - I have seen this with gc enabled

+s = requests.session()
+
 for x in range(20):
-    requests.head("http://www.google.com/")
+    s.head("http://www.google.com/")

I get the same result, 20 leaked ESTABLISHED connections. Are we sure that keep-alive is working?

@slingamn
Contributor

slingamn commented May 4, 2012

Sorry, documentation-reading fail. I wasn't consuming the content of the response objects. If I do that, things work as expected, connections get reused, and in the end only one is left alive.

@slingamn
Contributor

slingamn commented May 6, 2012

Here's my understanding of this issue so far:

  1. @Bluehorn and @gsakkis are right: the root problem is the reference cycle between requests.Request and requests.Response.
  2. Breaking this cycle in a crude way, e.g., removing the request member of the Response class, fixes @Bluehorn's test case.
  3. Although it might be possible to improve the way Requests uses urllib3, that can't fix the test case, because by default Requests returns the Response object with the socket still open and the body unread. Therefore, we rely on garbage collection to close the socket when clients don't read all the data. (Technically, since the test case is doing a HEAD request, the implementation of requests.head could be changed to close the socket. But the problem would still exist for the analogous test case calling requests.get.)

So here are the three viable approaches that occur to me:

  1. Replace Response.request with a weakref.proxy, as @gsakkis suggested. This breaks the following test: r = requests.get(httpbin('get')); assert r.request.sent (because r.request has been garbage-collected already, so we get "ReferenceError: weakly-referenced object no longer exists"). Either way, this involves a smallish API breakage.
  2. Remove Response.request. This has the advantage of not using weak references. This breaks the above test and also breaks HTTPDigestAuth, which uses the response hook and expects the Response object it operates on to have a request member. We could work around this by changing the hook API, but either way, it seems like a larger API break.
  3. Enable prefetch and disable keep_alive by default, and then ensure that all the data is read and the socket is closed before we return to the client.

Random thought: it seems like keep-alive should be off by default. Outside the context of a session, we have no way to reuse the connection anyway.

@slingamn
Contributor

slingamn commented May 6, 2012

Heh, we could remove Response.request in general, but add it back in for the duration of the hook invocation. This makes #2 look more like #1, in terms of API breakage.

https://gist.github.com/2621434

@slingamn
Contributor

slingamn commented May 6, 2012

In favor of option 3: any solution that relies on eager garbage collection won't work with PyPy.

@shazow
Contributor

shazow commented May 6, 2012

Once upon a time prefetch was on by default, we went through significant lengths to fix that. :)

Also it's pretty rare that you'll only make 1 request to a host in a script, so having keep_alive on by default is a nice feature. Otherwise we'd be content using urllib{1,2} with a thin API wrapper.

@slingamn
Contributor

slingamn commented May 6, 2012

@shazow

Can you clarify the reasons for defaulting prefetch to off instead of on?

So --- defaulting keep_alive=True in the context of a session makes sense. But in the context of the vanilla API, that is to say, a plain requests.get or requests.post, there doesn't seem to be any mechanism for reusing the keep-alive connections, because an ephemeral Session object is created and it is given an ephemeral PoolManager in init_poolmanager().

@shazow
Contributor

shazow commented May 6, 2012

Perhaps the original reason was that it made urllib3 and requests perform better in naive benchmarks. Upon further consideration, it also made sense to fetch only the data we're actually using, especially for large responses when we only really care about headers. Someday in the future when we add ETAG-based caching, this will be a necessity. Overall, lots of upsides, no real downsides (until now).

I'm not super familiar with the Requests code in this context, but I'd imagine we'd want a reused static PoolManager instance for these things.

@slingamn
Contributor

slingamn commented May 6, 2012

A static or module-level mechanism for sharing and reusing the connections would make sense. On the other hand, it would significantly worsen the current "gotcha" with leaked connections, because if a client doesn't call .content, those connections will remain reserved in the static pool forever --- not even GC will be able to reclaim them.

@shazow
Contributor

shazow commented May 6, 2012

What if we made an explicit del call on the connection when the LRU invalidates the object?

@slingamn
Contributor

slingamn commented May 6, 2012

Then we'd just leak connections up to the size of the LRU cache (10 by default?).

My vote would be to enable prefetch=True, keep keep_alive=True, and add a module-level connection pool for reusing connections. I don't think this would break a future implementation of ETAG caching; we could just check the ETAG before prefetching and hang up the phone if the data is cached.

@shazow
Contributor

shazow commented May 6, 2012

Is an O(1) leak really a leak? ;)

I'm -0 on prefetch=True just because I feel the current settings cover the +80% use case, but I wouldn't shed any tears. Up to @kennethreitz, I'd say.

@slingamn
Contributor

slingamn commented May 6, 2012

In the real world, everything is a constant factor ;-)

@piotr-dobrogost

Module-level connection pool doesn't sound like a good idea to me. Setting prefetch=True just because that's easier :) doesn't sound good, either. Just saying.

@slingamn
Contributor

slingamn commented May 6, 2012

@piotr-dobrogost if we don't prefetch, I don't see a good solution that works on all platforms, including those like PyPy that lack eager garbage collection.

For comparison, here's something that always infuriated me about urllib2:

fileobj = None
try:
    fileobj = urllib2.urlopen(MY_URL)
    text = fileobj.read()
finally:
    if fileobj is not None:
        fileobj.close()

(Note the None guard: without it, a failed urlopen leaves fileobj undefined and the finally clause raises a NameError on top of the original error.)

Seems like the 80%, batteries-included thing is to have an API that reads the whole page in a fire-and-forget way.

@kennethreitz
Contributor

A module-level connection pool? Absolutely not. That's what session is for.

@slingamn
Contributor

slingamn commented May 6, 2012

That makes sense, as long as we turn off keep_alive by default. I think:

  1. Default prefetch=True
  2. Default keep_alive=False (unless we're in an actual Session)
  3. Ensure the connections are being closed

This will resolve the connection leak. Optionally, we could also break the reference cycle, as a memory-usage optimization for CPython.

@kennethreitz
Contributor

All requests are made within a session.

@slingamn
Contributor

slingamn commented May 6, 2012

Right. For "actual Session", read "non-ephemeral Session", i.e., "Session created by the client with the expectation of reuse."

@slingamn
Contributor

slingamn commented May 6, 2012

But I guess this doesn't answer all the questions about expected behavior of Session objects. I think something like this:

  1. A non-ephemeral Session should send Connection: keep-alive by default, and retain connections in a pool
  2. On CPython, the Session object should not be involved in any reference cycles, and those connections should be disposed of when the object goes out of scope
  3. For PyPy etc., there should be a Session.dispose() and/or a context manager that manually closes the connections
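A minimal shape for points 2 and 3 above, with `ToyConnection` standing in for a pooled connection; `dispose()` and the context-manager protocol are the proposal here, not the then-current requests API:

```python
class ToyConnection:
    """Hypothetical stand-in for a pooled connection."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

class Session:
    """Sketch of a Session with explicit cleanup, per point 3 above."""
    def __init__(self):
        self.connections = [ToyConnection(), ToyConnection()]
    def dispose(self):
        # Close every pooled connection deterministically, no GC required,
        # so the sketch behaves the same on CPython and PyPy.
        for conn in self.connections:
            conn.close()
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc, tb):
        self.dispose()

with Session() as s:
    pass  # ... issue requests here ...
all_closed = all(conn.closed for conn in s.connections)
print(all_closed)  # -> True
```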

@kennethreitz
Contributor

It shouldn't matter if it's ephemeral or not. When the session dies (and the connections therein), so should the connection pool.

There should be no specific behavior for different platforms.

@slingamn
Contributor

slingamn commented May 6, 2012

Well, we wouldn't special-case behavior inside Requests itself based on platforms. CPython would do the right thing whether or not the client called dispose() (which is what CPython users probably expect --- instantaneous scope-based GC). PyPy users would have to ensure that they called dispose().

As for why it would matter whether the session is ephemeral or not, @craigholm makes a good point on #458: a client that does not support persistent connections "must" send Connection: close. If we're creating the session just to destroy it, that's essentially not supporting persistent connections. (Technically, you can get the session as resp.request.session and reuse it, but if we get rid of the reference cycle, that would also go away.)

@slingamn
Contributor

slingamn commented May 7, 2012

@Bluehorn that's fair. I think the test case points towards other gotchas that aren't fixed by doing a HEAD. For example:

  1. The client doing a POST and then reading only the status code
  2. The client doing a GET and then reading the content on a 200 but not on a 404.

@kennethreitz
Contributor

Essentially, some servers don't support HEAD requests. That's why prefetch isn't default.

@slingamn
Contributor

slingamn commented May 7, 2012

Intuitively, a property access is not supposed to have side effects; @property should transparently abstract away the computation of a "derived member", or transparently implement lazy evaluation. Closing a socket is quite a serious side effect.

Changing a default or creating even a small API break would definitely be unfortunate, but I'm having trouble seeing another way to protect the client from socket leaks.

@kennethreitz
Contributor

A user of Requests shouldn't be even remotely concerned with sockets. The solution's quite simple — don't leave references to sockets/sessions that aren't needed.

@slingamn
Contributor

slingamn commented May 8, 2012

I agree completely: no one should have to think about sockets. But if we don't prefetch, then a reference to the open socket must be retained --- whether the Response object or the Session object retains it is just an implementation detail, either way the socket has to be retained so the client can read the page body from it.

So when do we clean up this reference? The possible answers seem to be:

  1. When the Response goes out of scope. This does the right thing on CPython, but not on PyPy.
  2. When the user accesses Response.content or Response.text. This is nontransparent, and the user has to be aware of the side effect.
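The second option looks roughly like this sketch. `RawStream` and `release_conn` mimic urllib3's release-to-pool behavior but are simplified stand-ins, not the real API surface:

```python
class RawStream:
    """Stand-in for the urllib3 response that wraps the live socket."""
    def __init__(self, data):
        self._data = data
        self.released = False
    def read(self):
        return self._data
    def release_conn(self):
        self.released = True  # socket handed back to the pool

class Response:
    def __init__(self, raw):
        self.raw = raw
        self._content = None
    @property
    def content(self):
        # The side effect in question: the first access drains the
        # body and returns the connection to the pool.
        if self._content is None:
            self._content = self.raw.read()
            self.raw.release_conn()
        return self._content

resp = Response(RawStream(b"hello"))
held_before = not resp.raw.released   # True: socket still reserved
body = resp.content
released_after = resp.raw.released    # True: released on first access
print(held_before, body, released_after)
```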

@kennethreitz
Contributor

Let's do both. There's no side effect worth consideration, because that's all handled by connectionpool.

@Bluehorn
Author

Bluehorn commented May 8, 2012

BTW: I have no problem if not reading the reply keeps the connection alive, as long as this does not apply to HEAD requests.
My original code actually does POST requests and reads the replies in full; I only reduced it to HEAD requests so that the code is minimal. I am using requests to drive a SOAP service. Originally I did not use sessions, and introducing them worked around the problem for me (as a side effect, I need sessions in any case to optimize away HTTPS connects per request).

slingamn added a commit to slingamn/requests that referenced this issue May 15, 2012
slingamn added a commit to slingamn/requests that referenced this issue Aug 6, 2012
* prefetch now defaults to True, ensuring that by default, sockets
  are returned to the urllib3 connection pool on request end
* sessions now have a close() method, notifying urllib3 to close pooled
  connections
* the module-level API, e.g., `requests.get('http://www.google.com')`,
  explicitly closes its session when finished

When prefetch is False, the open socket becomes part of the state of the
Response object, and it's the client's responsibility to read the whole
body, at which point the socket will be returned to the pool.
kennethreitz pushed a commit that referenced this issue Aug 6, 2012
address connection leak issue from #520
@kennethreitz
Contributor

Fixed!

@rca

rca commented Oct 11, 2013

I'm running into this with requests==2.0.0. The code is using a timeout, might that be causing a problem?

@rca

rca commented Oct 11, 2013

I believe I have confirmed this to be a problem with sessions + timeout. When using the etcd-py package, a new connection is created any time this call times out; no dangling connection when the request returns data:

stuff = etcd.watch('message', timeout=30.0)

@Lukasa
Member

Lukasa commented Oct 12, 2013

I believe this problem is in urllib3, and will be incidentally fixed by urllib3/urllib3#252.

@rca

rca commented Oct 15, 2013

@Lukasa thanks for the tip. After more research I am pretty sure this is a problem with the server side code in etcd.

Thanks!

@mouadino

The fix from this issue was reverted here: kennethreitz@92d5703? Any reason why? I am especially talking about the changes in requests/api.py.

And can someone please explain what the guideline is for using an ephemeral session, as most of the requests/api.py shortcuts do? A new session means a new connection pool, which really doesn't make sense if you just want to do a requests.get(...), especially if the connections (there is only one, after all) in the pool are not closed afterward.

@Lukasa
Member

Lukasa commented Feb 24, 2014

@mouadino The commit you're talking about was the rewrite that took place for V1.0.0. This was basically a total rewrite of the library, so the fix was not 'reverted', it was just no longer necessary.

I'm not entirely clear on what you're asking in the last part, but you've got a substantial misunderstanding of how Requests behaves. When we create a connection pool we don't open all of those connections at once: that would be ridiculous, and HTTP doesn't work that way anyway. We open one connection when you make your first request, and then reuse it as much as possible.

There is an inefficiency when using requests.get() and friends, but that inefficiency is acceptable to get that clean API. People who don't want the inefficiency can simply use a Session object as described in the documentation.

@mouadino

@Lukasa Thanks for the quick response, but you misunderstood my question; sorry for not being clear. What I meant is actually what is described in this bug report.

If I have in my script a call like requests.get(...), this call will create a new Session object, which creates a new connection pool, and from the latter a connection will be created/borrowed to send my GET request. This connection will not be closed by the server (because of keep-alive, assuming the server understands it), so we now have one connection in the pool that is still open and ready to be reused, right?

But what if I do another requests.get(...)? This will also create another session, which means another connection pool, and as before the result will be one connection from this new pool that is still open and ready to be reused. But as you can see, the connections will not actually be reused, because in the end the session object is not reused.

So the result is that we now have 2 connections open to our server, which will not be closed until my program stops (cleaned up by the kernel!?) or the server cleans them up (if it supports a keep-alive timeout). You can imagine what will happen if I have a lot of requests.get(...) calls, for example.

First of all, does this make sense? If so, what is the guideline for using ephemeral sessions, given that they will not clean up after themselves unless I call session.close(...) (which will close open connections from the pool)? And if the close call is needed, why isn't it part of the API shortcuts? Wouldn't it make more sense to disable connection pooling in a use case like this one?

@Lukasa
Member

Lukasa commented Feb 24, 2014

Ah, that makes more sense. =)

The answer is no, you won't leak connections. This is because eventually the Session, the Connection Pool it contains and all of its active connections will get garbage collected, and when they do any open connections are closed by the garbage collector.

Of course, this doesn't matter because the server will close them well beforehand. Most HTTP servers don't keep the connection alive for more than a few seconds because it's a severe resource overhead for that server.

The guide for using ephemeral Sessions is simple: use them if you don't care about efficiency. If you use them you're accepting that you don't care whether your resources are cleaned up right away or not. If you believe that cleaning up after yourself is important then you should use an explicit Session. =)

@mouadino

Aha, because underneath they are actually sockets, which are closed automatically when they are garbage collected. But relying on the GC is not really good for my use case, and worse, I am dealing with an HTTP server that doesn't clean up open connections (or maybe I just haven't figured out yet how to make it do so). I can see that this is not something that requests can fix :=)

Thanks again for your response :)

@pratikshapaudyal9

What about a streaming requests session? How do we ensure it is closed?

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 3, 2021