UnicodeEncodeError: 'latin-1' codec can't encode characters #1822

Closed
xjsender opened this issue Dec 20, 2013 · 7 comments

@xjsender

I'm using the latest version of Requests.
When I try to post data that contains Chinese characters, this exception is thrown:

Traceback (most recent call last):
  File "X/threading.py", line 639, in _bootstrap_inner
  File "X/threading.py", line 596, in run
  File "C:\Users\Administrator\Dropbox\Sublime3056\Data\Packages\SublimeApex\salesforce\api.py", line 546, in execute_anonymous
    headers=headers)
  File "C:\Users\Administrator\Dropbox\Sublime3056\Data\Packages\SublimeApex\requests\api.py", line 88, in post
    return request('post', url, data=data, **kwargs)
  File "C:\Users\Administrator\Dropbox\Sublime3056\Data\Packages\SublimeApex\requests\api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\Administrator\Dropbox\Sublime3056\Data\Packages\SublimeApex\requests\sessions.py", line 338, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\Administrator\Dropbox\Sublime3056\Data\Packages\SublimeApex\requests\sessions.py", line 441, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\Administrator\Dropbox\Sublime3056\Data\Packages\SublimeApex\requests\adapters.py", line 292, in send
    timeout=timeout
  File "C:\Users\Administrator\Dropbox\Sublime3056\Data\Packages\SublimeApex\requests\packages\urllib3\connectionpool.py", line 428, in urlopen
    body=body, headers=headers)
  File "C:\Users\Administrator\Dropbox\Sublime3056\Data\Packages\SublimeApex\requests\packages\urllib3\connectionpool.py", line 280, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "X/http/client.py", line 1049, in request
  File "X/http/client.py", line 1086, in _send_request
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 1632-1633: ordinal not in range(256)
@sigmavirus24
Contributor

File "X/http/client.py"

Did you write X because that's a path to a local file? If so, your directory structure may be confusing urllib3. If not, you should probably raise this on bugs.python.org, since I don't think this is something Requests should be handling. The error looks like it's arising from httplib (or http.client on Python 3, which I'm guessing you're using).

@xjsender
Author

@sigmavirus24,

I use Requests in a Sublime Text plugin. If the soap_body in the statement below doesn't contain any Chinese characters, no exception is raised.

response = requests.post(self.apex_url, soap_body, verify=False, headers=headers)

@Lukasa
Member

Lukasa commented Dec 20, 2013

Firstly, unless you're using a different version of Sublime Apex from the one in their public repository, Requests is not the latest version; it's version 1.2.3. Can you tell me what version of Sublime Text you're using?

@xjsender
Author

It's Sublime Text 3056.

@Lukasa
Member

Lukasa commented Dec 20, 2013

So, ST 3, but not the most recent revision. Ok, that gives us something. Specifically, Sublime Text 3 uses Python 3.3, not Python 2.7 (which Sublime Text 2 used). This means all the default strings in Sublime Apex are unicode strings.

If you open up the Python 3.3 http.client file, you'll find that the _send_request() function contains the following:

# Honor explicitly requested Host: and Accept-Encoding: headers.
header_names = dict.fromkeys([k.lower() for k in headers])
skips = {}
if 'host' in header_names:
    skips['skip_host'] = 1
if 'accept-encoding' in header_names:
    skips['skip_accept_encoding'] = 1

self.putrequest(method, url, **skips)

if body is not None and ('content-length' not in header_names):
    self._set_content_length(body)
for hdr, value in headers.items():
    self.putheader(hdr, value)
if isinstance(body, str):
    # RFC 2616 Section 3.7.1 says that text default has a
    # default charset of iso-8859-1.
    body = body.encode('iso-8859-1')
self.endheaders(body)

Now, ISO-8859-1 is an alias for Latin-1, which is the codec we're having trouble with. The problem we've got is that Sublime Apex is providing a unicode string body to Requests, which httplib needs to encode into bytes. Taking the default from RFC 2616, it concludes you want Latin-1, which doesn't include any Chinese characters. Clearly then, encoding fails, and you get the exception in question.
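
To see which characters trip the codec, the failure described above can be reproduced without any network traffic. This is a minimal sketch using only the standard library; the helper name find_unencodable and the sample body are made up for illustration and are not part of Requests or Sublime Apex. It reads the start and end attributes that UnicodeEncodeError carries, which correspond to the "position 1632-1633" range reported in the traceback.

```python
def find_unencodable(text, encoding="iso-8859-1"):
    """Return (start, end, characters) for the first run of characters that
    cannot be encoded with `encoding`, or None if the whole text encodes.

    Hypothetical helper for illustration only.
    """
    try:
        text.encode(encoding)
    except UnicodeEncodeError as exc:
        # exc.start is inclusive and exc.end is exclusive, so the error
        # message "position 16-17" corresponds to the slice [16:18].
        return exc.start, exc.end, text[exc.start:exc.end]
    return None


# Made-up stand-in for a SOAP body that contains two Chinese characters.
sample_body = "<soap>" + "x" * 10 + "中文" + "</soap>"
offending = find_unencodable(sample_body)
```

Running the same helper over the real soap_body should point at the characters that the Latin-1 default cannot represent.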

Considering that Sublime Apex claims in the headers it sends to be sending UTF-8 encoded data (which is a lie currently), Sublime Apex wants to be encoding the data as UTF-8 before sending it. This means any line sending data (in this case line 545 of salesforce/api.py) should read like this:

response = requests.post(self.apex_url, soap_body.encode('utf-8'), verify=False, headers=headers)
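
If several call sites send SOAP bodies, the encoding step could be pulled into one small helper so the bytes and the declared charset always agree. The following is only a sketch under assumptions: the function name prepare_utf8_body and the Content-Type value are hypothetical, since the headers Sublime Apex actually sends are not shown in this thread.

```python
def prepare_utf8_body(text, headers=None):
    """Encode `text` as UTF-8 and return (body_bytes, headers).

    Hypothetical helper: the header value below is an assumption, not
    what Sublime Apex is known to send.
    """
    body = text.encode("utf-8")
    merged = dict(headers or {})  # copy so the caller's dict is not mutated
    # Only fill in a Content-Type if the caller did not provide one.
    merged.setdefault("Content-Type", "text/xml; charset=UTF-8")
    return body, merged


body, headers = prepare_utf8_body("<soap>中文</soap>")
# Two Chinese characters are 2 code points but 6 bytes in UTF-8,
# so the byte length differs from the string length.
char_count = len("<soap>中文</soap>")
byte_count = len(body)
```

The resulting bytes can then be passed as the data argument, in the same way as the requests.post call above.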

For the sake of anyone else who wants to confirm my diagnosis, here's a quick bit of sample code that confirms the problem:

import requests

a = "\u13E0\u19E0\u1320"
a.encode('latin1')  # Throws UnicodeEncodeError, proves that this can't be expressed in ISO-8859-1.
a.encode('utf-8')  # Totally fine.
r = requests.post('http://httpbin.org/post', data=a)  # Using unicode string, throws UnicodeEncodeError blaming Latin1.
r = requests.post('http://httpbin.org/post', data=a.encode('utf-8'))  # Works fine.

Thanks for raising this with us, but this is not a Requests bug. =)

Lukasa closed this as completed Dec 20, 2013
@xjsender
Author

Thanks.

@wuminmin

r = requests.post('http://httpbin.org/post', data=a.encode('utf-8'))
Very useful, thank you!
