Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chunked requests still don't really work #1149

Closed
sorenisanerd opened this issue Jul 16, 2017 · 4 comments · Fixed by #1198
Closed

Chunked requests still don't really work #1149

sorenisanerd opened this issue Jul 16, 2017 · 4 comments · Fixed by #1198
Labels

Comments

@sorenisanerd
Copy link

sorenisanerd commented Jul 16, 2017

(Updated, because I realised I had been attempting to fix this, so got confused by the original issue and the issue that made me give up)

I'm making a chunked request to an app that uses Werkzeug (current git master). get_input_stream ends up calling LimitedStream(wsgi.input, min(content_length, max_content_length)). min(content_length, max_content_length) will always be None if using chunked encoding.

In an attempt to make progress I tried using LimitedStream(stream, content_length or max_content_length), but then I end up getting a ClientDisconnected exception, because we can't tell the difference between a client disconnecting and the chunked request simply being done, as wsgi.input.read() just returns a 0-length string in both cases.

@davidism
Copy link
Member

This is more than just the simple bug with min. After fixing that, the problem is that doing wsgi.input.read(max_content_length) with Werkzeug blocks forever if there's less than max_content_length to read. Or using Gunicorn, which supports chunks by parsing and buffering the whole message, the length comes out to less than max_content_length, so LimitedStream thinks the client disconnected.

I tested this out on Django just to be sure we're not doing something weird, it sees an empty stream too.

@davidism
Copy link
Member

The simplest way forward is to provide a middleware that sets environ['wsgi.input_terminated'] and tell people to use it when using a server that supports chunked transfer.

@djetelina
Copy link

Which wsgi currently supports input_terminated? After burning all day on debugging why the data is empty, then why stream is empty then realizing it's due to chunked stream encoding I now can't find anything that supports this :/

@javabrett
Copy link

In addition to checking environ.get('wsgi.input_terminated') should we also check environ.get('HTTP_TRANSFER_ENCODING') == 'chunked' and in that case also return the unwrapped stream directly? That fixes problems like a simple curl -H 'Transfer-Encoding: chunked' ... returning an empty stream.

Per RFC2616

4.4.2
If a Transfer-Encoding header field (section 14.41) is present and
has any value other than "identity", then the transfer-length is
defined by use of the "chunked" transfer-coding (section 3.6),
unless the message is terminated by closing the connection.

4.4.3
If a Content-Length header field (section 14.13) is present, its
decimal value in OCTETs represents both the entity-length and the
transfer-length. The Content-Length header field MUST NOT be sent
if these two lengths are different (i.e., if a Transfer-Encoding
header field is present). If a message is received with both a
Transfer-Encoding header field and a Content-Length header field,
the latter MUST be ignored.

... noting that Transfer-Encoding of chunked demands that any Content-Length header be ignored, and so no stream-processing should be done based on it. This is currently the case unless wsgi.input_terminated is set.

Should it really be necessary to set wsgi.input_terminated in order for normal chunked encoding e.g. of a 10 byte JSON payload to work?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants