Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

webob 1.7 compat: provide a charset for json responses #2971

Closed
wants to merge 1 commit into from
Closed

webob 1.7 compat: provide a charset for json responses #2971

wants to merge 1 commit into from

Conversation

asottile
Copy link

@asottile asottile commented Mar 3, 2017

Hitting this at work while trying to upgrade webob -- we're using this hackety workaround for now:

def charset_json_renderer(info):
    json_renderer = renderers.JSON()(info)

    def render_json(value, system):
        request = system['request']
        ret = json_renderer(value, system)
        request.response.charset = 'utf-8'
        return ret
    return render_json


def setup_app(cfg):
    # ...
    cfg.add_renderer(None, charset_json_renderer)

Demonstration of failing test:

with test, webob<1.7

$ pip install 'webob<1.7.0'
Collecting webob<1.7.0
  Downloading WebOb-1.6.3-py2.py3-none-any.whl (78kB)
    100% |████████████████████████████████| 81kB 2.9MB/s 
Installing collected packages: webob
  Found existing installation: WebOb 1.7.0
    Uninstalling WebOb-1.7.0:
      Successfully uninstalled WebOb-1.7.0
Successfully installed webob-1.6.3
$ py.test pyramid/tests/test_renderers.py 
============================= test session starts ==============================
platform linux2 -- Python 2.7.12, pytest-3.0.6, py-1.4.32, pluggy-0.4.0
rootdir: /home/asottile/workspace/pyramid, inifile: 
collected 60 items 

pyramid/tests/test_renderers.py ............................................................

========================== 60 passed in 0.32 seconds ===========================

with test, webob==1.7.1

$ pip install webob --upgrade
Collecting webob
  Using cached WebOb-1.7.1-py2.py3-none-any.whl
Installing collected packages: webob
  Found existing installation: WebOb 1.6.3
    Uninstalling WebOb-1.6.3:
      Successfully uninstalled WebOb-1.6.3
Successfully installed webob-1.7.1
$ py.test pyramid/tests/test_renderers.py 
============================= test session starts ==============================
platform linux2 -- Python 2.7.12, pytest-3.0.6, py-1.4.32, pluggy-0.4.0
rootdir: /home/asottile/workspace/pyramid, inifile: 
collected 60 items 

pyramid/tests/test_renderers.py ......F.....................................................

=================================== FAILURES ===================================
________________ TestJSON.test_with_request_content_type_notset ________________

self = <pyramid.tests.test_renderers.TestJSON testMethod=test_with_request_content_type_notset>

    def test_with_request_content_type_notset(self):
        request = testing.DummyRequest()
        renderer = self._makeOne()(None)
        renderer({'a':1}, {'request':request})
        self.assertEqual(request.response.content_type, 'application/json')
>       self.assertEqual(request.response.charset, 'utf-8')
E       AssertionError: None != 'utf-8'

pyramid/tests/test_renderers.py:28: AssertionError
===================== 1 failed, 59 passed in 0.34 seconds ======================

After the patch, the tests passes in both versions

@mmerickel
Copy link
Member

AFAIK dropping the charset off of a non-text content-type was the main goal of webob 1.7's changes. A JSON response is not supposed to have an explicit charset. I think the first question is -
why are you expecting response.charset == 'utf-8'? Why does your code need this?

@asottile
Copy link
Author

asottile commented Mar 3, 2017

We deal with non-ASCII-non-latin1 json so a charset is necessary for the client to properly decode the bytes it receives on the wire.

A quick google confirms other frameworks do / require the same thing: https://www.google.com/search?client=ubuntu&channel=fs&q=json+response+encoding&ie=utf-8&oe=utf-8

The specific bit of code that's very angry with no charset: https://github.com/striglia/pyramid_swagger/blob/e89cf43fc91d382d9255669e0cd9f12543988c8b/pyramid_swagger/tween.py#L545-L549

@asottile
Copy link
Author

asottile commented Mar 3, 2017

JSON is text (not a binary response) so I think it should have a charset

@mmerickel
Copy link
Member

We deal with non-ASCII-non-latin1 json so a charset is necessary for the client to properly decode the bytes it receives on the wire.

For non-UTF* json it's certainly important to force a charset, but we're talking about utf8 here which the mimetype for application/json is pretty clear about [1]. I posit that the code in your pyramid_swagger link is incorrect here.

Some other discussions [2] [3] cover this fact as well.

[1] https://tools.ietf.org/html/rfc4627#section-3
[2] request/request#383 (comment)
[3] pallets/werkzeug#847

@asottile
Copy link
Author

asottile commented Mar 3, 2017

The exception guards against this trace (webob<1.7) -- the code is compatible with webob 1.4 - webob 1.7:

>>> webob.response.Response(b'{"foo": "bar"}', content_type='application/json').text
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/venv/lib/python3.6/site-packages/webob/response.py", line 420, in _text__get
    "You cannot access Response.text unless charset is set")
AttributeError: You cannot access Response.text unless charset is set

@mmerickel
Copy link
Member

I personally would expect the Response.default_body_encoding to deal with that when the charset is not defined, but that is something for @bertjwregeer to comment on as he was in the nitty gritty there.

@asottile
Copy link
Author

asottile commented Mar 3, 2017

It does, but that's only present in webob 1.7+ so we're in a bit of a pickle of not being able to roll forward or back without a charset in the response.

@mmerickel
Copy link
Member

It does, but that's only present in webob 1.7+ so we're in a bit of a pickle of not being able to roll forward or back without a charset in the response.

Sorry I've read this issue several times now and I'm still confused what you mean here. I think you're saying that pyramid_swagger is struggling to perform a check that will work with both webob 1.6 and 1.7.

It seems to me that you should just stop checking for the charset because in both versions response.text should "just work" but for different reasons. In webob 1.6 it would use the default charset of utf8 and in 1.7 it would use the default_body_encoding.

@digitalresistor
Copy link
Member

We deal with non-ASCII-non-latin1 json so a charset is necessary for the client to properly decode the bytes it receives on the wire.

You are not dealing with JSON at this point. JSON itself as defined as either UTF-8/16/32 (and the latter two only with a BOM). You'll want to use something other than application/json if this is what you are putting on the wire (like application/octet-stream; charset=whatever).

Setting a charset will work, and WebOb will dutifully set the Content-Type to application/json; charset=UTF-8 but this is not valid according to the json specification, nor is application/json; charset=ASCII for example.

Your tests shouldn't be checking for a charset on JSON responses (because it doesn't exist), and always force them to be valid JSON by making sure they at valid UTF-8 or UTF-16/32 with BOM.

WebOb 1.7 will do it's best to not send a charset content-type parameter if the content-type shouldn't have one. This change would undo that work and have Pyramid send a response that is invalid, please fix your code that is causing the failure.

@digitalresistor
Copy link
Member

Issue created for pyramid_swagger: https://github.com/striglia/pyramid_swagger/issues/185

@tseaver
Copy link
Member

tseaver commented Mar 4, 2017

My understanding, based on RFC 2045 / 2616, is that the charset parameter is only really defined for text/* types: the point of application/* types is that they aren't meant to be treated text, but are bytes in a format defined by the given application (and JSON mandates UTF-8).

See the application/json [IANA registration](http://www.iana.org/assignments/media-types/application/json:

Note: No "charset" parameter is defined for this registration.
Adding one really has no effect on compliant recipients.

@asottile
Copy link
Author

asottile commented Mar 6, 2017

Understandable.

I'm in a bit of a pickle with upgrading however as we're running a SOA environment:

  • Current state: all services running webob<1.6 (which raises on .text without charset)
  • Desired state: all services running webob>=1.7 (which allows .text for content-type of application/json without charset)
  • There's an (unfortunate) monolith which has both upstream and downstream connection with nearly every service (a cycle).
  • Upgrading any service to webob1.7 will cause it to no longer send charset=utf-8 with responses and will break any downstream consumers which are using webob<1.7

As you can see, this seemingly harmless backward incompatibility is really wedging my plans to upgrade :(

I initially opened this in that it was a quick fix that seemed to be a regression, though after consideration it seems that this change is intentional.

Thanks for the comments, I'll bang my head against this some more and see if I can't find an upgrade pathway that's possible at all -- feel free to close this PR :)

@mmerickel
Copy link
Member

I'm not sure I see the pickle here. The only issue seems to be that your downstream consumers were implemented incorrectly to assume a charset would be there. Webob has attempted to fix this a few times in its history. One change of note is https://github.com/Pylons/webob/blob/8669fe335b54697f23a787f67da19552bc03d5c6/HISTORY.txt#L122 which was put into 1.6 but apparently in such a way that it didn't affect you (lucky!).

I mean I guess you either keep using your custom renderer to ensure the charset is always there, or you fix your consumers to be compliant with the RFC. Probably a little of a and a little of b until you decide you can drop your shim.

@mmerickel mmerickel closed this Mar 6, 2017
@asottile
Copy link
Author

asottile commented Mar 6, 2017

The pickle is that old webob and new webob unfortunately. Most of our stuff is super old (1.4) and upgrading any one of them breaks downstreams (which leads to circular brokenness).

It's fine though, I'll figure it out.

@asottile asottile deleted the webob_1_7_compat branch March 6, 2017 20:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants