UnicodeEncodeError: 'latin-1' codec can't encode characters #1822
Did you write |
I used Requests in a Sublime plugin. If the soap_body in the statement below doesn't contain any Chinese characters, no exception is raised.
|
Firstly, unless you're using a different version of Sublime Apex from the one in their public repository, Requests is not the latest version: it's version 1.2.3. Can you tell me what version of Sublime Text you're using? |
It's Sublime Text 3056 |
So, ST 3, but not the most recent revision. OK, that gives us something. Specifically, Sublime Text 3 uses Python 3.3, not Python 2.7 (which Sublime Text 2 used). This means all the default strings in Sublime Apex are unicode strings. If you open up the Python 3.3

    # Honor explicitly requested Host: and Accept-Encoding: headers.
    header_names = dict.fromkeys([k.lower() for k in headers])
    skips = {}
    if 'host' in header_names:
        skips['skip_host'] = 1
    if 'accept-encoding' in header_names:
        skips['skip_accept_encoding'] = 1
    self.putrequest(method, url, **skips)
    if body is not None and ('content-length' not in header_names):
        self._set_content_length(body)
    for hdr, value in headers.items():
        self.putheader(hdr, value)
    if isinstance(body, str):
        # RFC 2616 Section 3.7.1 says that text default has a
        # default charset of iso-8859-1.
        body = body.encode('iso-8859-1')
    self.endheaders(body)

Now, ISO-8859-1 is an alias for Latin-1, which is the codec we're having trouble with. The problem we've got is that Sublime Apex is providing a unicode string body to Requests, which httplib needs to encode into bytes. Taking the default from RFC 2616, it concludes you want Latin-1, which doesn't include any Chinese characters. Clearly then, encoding fails, and you get the exception in question.

Considering that Sublime Apex claims in the headers it sends to be sending UTF-8 encoded data (which is a lie currently), Sublime Apex should encode the data as UTF-8 before sending it. This means any line sending data (in this case line 545 of

    response = requests.post(self.apex_url, soap_body.encode('utf-8'), verify=False, headers=headers)

For the sake of anyone else who wants to confirm my diagnosis, here's a quick bit of sample code that confirms the problem:

    import requests

    a = "\u13E0\u19E0\u1320"
    a.encode('latin1')  # Throws UnicodeEncodeError, proves that this can't be expressed in ISO-8859-1.
    a.encode('utf-8')  # Totally fine.
    r = requests.post('http://httpbin.org/post', data=a)  # Using unicode string, throws UnicodeEncodeError blaming Latin1.
    r = requests.post('http://httpbin.org/post', data=a.encode('utf-8'))  # Works fine.

Thanks for raising this with us, but this is not a Requests bug. =) |
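The diagnosis above can also be checked without any network access. The following is only a minimal sketch using the standard library: `encode_soap_body` and `fits_in_latin1` are hypothetical helper names (they are not part of Requests or Sublime Apex), and the `text/xml` content type is an assumption about what a SOAP endpoint expects.

```python
def fits_in_latin1(text):
    """Return True if text survives the ISO-8859-1 encoding applied to str bodies."""
    try:
        text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


def encode_soap_body(soap_body, charset="utf-8"):
    """Hypothetical helper: encode the body and declare the same charset in the headers."""
    body_bytes = soap_body.encode(charset)
    # Assumption: the endpoint accepts text/xml; adjust for the real service.
    headers = {"Content-Type": "text/xml; charset={0}".format(charset)}
    return body_bytes, headers


sample = "<name>\u4e2d\u6587</name>"  # body containing two Chinese characters
body, headers = encode_soap_body(sample)

print(fits_in_latin1("<name>abc</name>"))  # ASCII-only text fits in Latin-1
print(fits_in_latin1(sample))              # Chinese characters do not
print(type(body).__name__, headers["Content-Type"])
```

Assuming a helper like this, the resulting bytes and headers could be passed along as `requests.post(url, data=body, headers=headers)`, so the body is already bytes by the time it reaches the `isinstance(body, str)` branch and that branch never runs.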
Thanks. |
r = requests.post('http://httpbin.org/post', data=a.encode('utf-8')) |
Requests is the latest version.
When I try to post data that contains Chinese characters, this exception is thrown.