Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why default to simplejson? #3052

Closed
digitaldavenyc opened this issue Mar 15, 2016 · 39 comments
Closed

Why default to simplejson? #3052

digitaldavenyc opened this issue Mar 15, 2016 · 39 comments

Comments

@digitaldavenyc
Copy link

I've recently run into an issue where another library installed simple-json and because the requests library defaults to it if available, it caused all of our json requests to fail due to a decoding problem.

I'm unclear why requests even defaults to simple-json anymore. I'd be happy to contribute to a PR to make the json library for requests more controllable but wanted to submit an issue first. Or perhaps there is another way I'm unaware of that would allow more control over which json library requests will use.

@Lukasa
Copy link
Member

Lukasa commented Mar 15, 2016

Thanks for this report! Please see issue #2516 for the last time this was discussed.

@Lukasa Lukasa closed this as completed Mar 15, 2016
@Lukasa
Copy link
Member

Lukasa commented Mar 15, 2016

So to be clear, I am now as I was then open to removing simplejson as an option entirely, but @kennethreitz wanted it there. =)

@digitaldavenyc
Copy link
Author

I'm not saying it should be removed ... but perhaps there is a better way to control the import. It's possible for libraries to conflict with each other. Could there be a way to specify the json library to requests?

@kennethreitz
Copy link
Contributor

Historically, I was under the impression that a freshly-compiled simplejson has better performance characteristics than the json module that is built into Python 2.6.

@kennethreitz
Copy link
Contributor

I believe the "speed-ups" weren't enabled in the json module until 2.7.

Recalling from memory here, so I may not be correct.

@Lukasa
Copy link
Member

Lukasa commented Mar 15, 2016

Controlling imports is extremely fraught because it relies on delaying the import until some later time in the execution. I am pretty averse to doing that because it causes bugs to appear at weird places and can generally make life weirdly tricky. If we're going to keep it we should keep it, if we're going to remove it we should remove it.

@kennethreitz
Copy link
Contributor

The last three comments here state our stance nicely https://github.com/kennethreitz/requests/pull/2516#issuecomment-89005463

@digitaldavenyc
Copy link
Author

So what would you suggest to do when two libraries conflict with each other? Monkey patch the requests library to control which json import is used?

@kennethreitz
Copy link
Contributor

Conflict with eachother? I don't understand. If simplejson doesn't function properly on your system, then you probably find out why, or uninstall it.

@kennethreitz
Copy link
Contributor

I'm not against removing if it's really a problem for some users, as I don't think it would impact anyone in any measurable way, but it's currently one of those nice final touches that makes using Requests a pleasure — an extreme level of thoughtfulness.

Five years ago, people cared a lot more about the speed of their encoding/decoding serialization libraries than they seem to today. This was best practice. I don't think it's harmful either.

@kennethreitz
Copy link
Contributor

It should also be noted that 2.6 is being used far less today than it was five years ago.

@kennethreitz
Copy link
Contributor

@digitaldavenyc the only acceptable way to control that import flow would be to support an environment variable e.g. REQUESTS_NO_SIMPLEJSON. That is not going to happen.

@digitaldavenyc
Copy link
Author

Yes I'll give my real world scenario ... One library is using requests and after some digging found that it was overriding json decode for various reasons. It works great with requests when using the standard python json library but another python library decided to import simplejson and broke decoding in requests. You could say that to real issue was the override but I think the lack of control over which json library requests use is an issue.

@kennethreitz
Copy link
Contributor

It should also be noted that our json support is very simple, and in no way necessary for using the library — you can easily json.loads(response.content) (although we do a little bit more with encodings, I believe).

@digitaldavenyc
Copy link
Author

Does simplejson actually have any performance gains over native json in newer versions of Python 3.3+ ?

@kennethreitz
Copy link
Contributor

I have no idea, I doubt it. I specifically said Python 2.6.

@sigmavirus24
Copy link
Contributor

Or json.loads(response.text). Yes it's nice to use a convenience method, but requests has lots of convenience methods that people bypass in certain cases and this is a great example of one case (where you have another requirement on simplejson which should work exactly identically to the standard library's json module).

Questions about simplejson's performance are not relevant to requests though.

@digitaldavenyc
Copy link
Author

I was about to say maybe there is no reason to change anything in requests but if there are no performance related reasons to use simplejson, then why bother including it in requests anymore?

@kennethreitz
Copy link
Contributor

Have you read any of the comments I made earlier?

@kennethreitz
Copy link
Contributor

Performance related reasons == Python 2.6.x.

@digitaldavenyc
Copy link
Author

@kennethreitz Exactly. You just mentioned that not many people use Python 2.6 anymore. Is it still a nice to have feature that's worth including?

@kennethreitz
Copy link
Contributor

Maybe, maybe not. That's what my statements above were postulating.

@kennethreitz
Copy link
Contributor

@digitaldavenyc
Copy link
Author

I agree with you that including simple-json support was great 3-4 years ago, it did show that level of thoughtfulness that makes requests a great library. But I am failing to see the value of including simple-json moving forward in future releases. If only applications using Python 2.6 can get any benefits from using simple-json, it seems pointless to include any longer since that is the lowest version of Python currently supported by requests.

I am just a user of requests but you guys maintain the library and is for you to decide.

@kennethreitz
Copy link
Contributor

I agree. A few options here:

  1. Keeping things as they are (always preferred)
  2. Removing simplejson logic completely (I think this would be fine)
  3. Limiting simplejson logic to 2.6 only (unideal, but would limit potential side-effects)

I like 1 the best, with a "if it ain't broke, don't fix it!" mentality, very loosely held. I know 2 is what @sigmavirus24 and @Lukasa prefer, and if it's worth the (minimal) effort, I'm not against it if they're for it.

@digitaldavenyc
Copy link
Author

I would think 3 would probably be the best option since Python 2.6 users may still be using simple-json and it is still supported by requests as of now. Could there potentially be any issues with that?

@kennethreitz
Copy link
Contributor

I don't think anyone will either notice or care. I put it in for them because I cared more than they did :)

@kennethreitz
Copy link
Contributor

If the community really cared about speed, we'd all be using ujson all the time. It's about convenience nowadays, and sticking to the crowd, above most else.

@kennethreitz
Copy link
Contributor

Found on the internet:

With simplejson (when using speedups module), string's type depends on its value, i.e. it decodes strings as 'str' type if their characters are ascii-only, but as 'unicode' otherwise; strangely enough, it always decodes strings as unicode (like standard json module) when speedups are disabled.

Rookie mistake, although quite common practice pre-python3. Requests did this for years for Response.content before I started working on Python 3 support and was forced to make the distinct Response.text property for unicode (much better design).

I suspect this is the issue you're running into.

Given this information, 2 it will be: removing simplejson logic completely.

@digitaldavenyc
Copy link
Author

2 is probably the best route to take in the long term.

@digitaldavenyc
Copy link
Author

Well I did appreciate it when you did include simple-json back in 2010 if that's any consolation :-)

@kennethreitz kennethreitz reopened this Mar 16, 2016
@digitaldavenyc
Copy link
Author

Would it make the most sense to re-open the original PR https://github.com/kennethreitz/requests/pull/2516 ?

@Lukasa
Copy link
Member

Lukasa commented Mar 16, 2016

No. #2516 bizarrely added code to fall back to simplejson, which could never possibly be hit (all Python versions have a json module that successfully imports.

@digitaldavenyc
Copy link
Author

Right that doesn't make sense. New PR then?

@Lukasa
Copy link
Member

Lukasa commented Mar 16, 2016

Yup. =)

@sigmavirus24
Copy link
Contributor

With #3056 merged I'm closing this.

@Bhanuprakash111
Copy link

I agree. A few options here:

  1. Keeping things as they are (always preferred)
  2. Removing simplejson logic completely (I think this would be fine)
  3. Limiting simplejson logic to 2.6 only (unideal, but would limit potential side-effects)

I like 1 the best, with a "if it ain't broke, don't fix it!" mentality, very loosely held. I know 2 is what @sigmavirus24 and @Lukasa prefer, and if it's worth the (minimal) effort, I'm not against it if they're for it.

2 is probably the best route to take in the long term.

I have alos the same issue, but I am unable to remove the simplejson using " pip3 uninstall simplejson ", If it is not the way to remove it, please tell me how to remove it

@kennethreitz
Copy link
Contributor

kennethreitz commented Nov 15, 2019 via email

@kennethreitz
Copy link
Contributor

therefore, it is used firstmost if it is available, as it is likely faster for the user if they are using the wheel, or have the c extensions available. deserialization is a very slow operation, computationally, especially if you are doing hundreds of thousands of requests. anything we can do to help optimize this, we should be doing.

Removing this wouldn't be api-incompatible, but it would require a new release of requests, which should only contain security fixes. Users may be relying on this functionality, and it is causing no harm to the codebase.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants