Response should not return 'ISO-8859-1' as default encoding #1774
Comments
I'm 90% sure we just received a similar issue. Let me find it before I give you what I think I remember as the conclusion.
And the issue I was thinking of is still open. I'm closing this to centralize discussion over there, with the added request that you please look at the open issues before you open a new one. Look closely and maybe even read some of the issue bodies, because we're keeping a bunch of "discussion" issues open as well as already fixed issues.
This issue has been raised many times in the past (please see #1737, #1604, #1589, #1588, #1546; there are others, but this list should be sufficient). The issue @sigmavirus24 is looking for is #1604. RFC 2616 is very clear here: if no encoding is declared in the Content-Type header, the encoding for text/html is assumed to be ISO-8859-1. If you know better, you are encouraged to either decode
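The rule described above can be sketched roughly as follows. This is only an illustration of the RFC 2616 default, not the library's actual implementation; the function name `guess_encoding` is hypothetical, and the header parsing leans on the standard library's `email.message.Message`.

```python
from email.message import Message


def guess_encoding(content_type):
    """Hypothetical helper: pick a charset from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type

    # An explicitly declared charset always wins.
    charset = msg.get_param("charset")
    if isinstance(charset, tuple):
        # RFC 2231-encoded parameters come back as (charset, language, value).
        charset = charset[2]
    if charset:
        return charset

    # RFC 2616: text/* without a declared charset is assumed to be ISO-8859-1.
    if msg.get_content_maintype() == "text":
        return "ISO-8859-1"

    # Non-text types with no charset: nothing to assume.
    return None
```

With this sketch, a bare `text/html` header yields `ISO-8859-1`, which is the behaviour being reported, while `text/html; charset=utf-8` yields `utf-8`.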
As usual @Lukasa is 100% correct.
@Lukasa thanks for your explanation! I think not every user knows the details defined in RFC 2616, so could you add a comment on
Adding to the documentation never hurts. It also doesn't hurt to make check
Hi, when the code that gets the encoding fetches http://lianxu.me/blog/2012/11/14/10-cocoa-objc-newbie-problems/, it returns the default encoding 'ISO-8859-1' (the page's Content-Type is `text/html`, not `text/html; charset=utf-8`). Because the encoding is then 'ISO-8859-1', the text is produced by calling `unicode(content, 'ISO-8859-1')`. But the content is already UTF-8 encoded, so this returns an invalid unicode string on which I cannot call `unicode.decode('utf-8')`.
I'll show you the code
I think requests should return None when no encoding is found; otherwise this produces wrong text that the user cannot decode.
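For anyone hitting the same thing, here is a minimal, network-free sketch of what goes wrong and how the text can still be recovered. The sample string is made up for illustration and stands in for the page body. It relies on ISO-8859-1 mapping every byte to exactly one code point, so the wrong decode is lossless and can be reversed.

```python
# Bytes as the server would send them: UTF-8, but with no charset declared.
raw = "中文".encode("utf-8")

# What you get when the bytes are decoded with the assumed default.
wrong = raw.decode("iso-8859-1")

# The mis-decoded text can be turned back into the original bytes,
# then decoded with the encoding you know to be correct.
recovered = wrong.encode("iso-8859-1").decode("utf-8")

# Decoding the raw bytes directly gives the same result.
right = raw.decode("utf-8")
```

With requests itself, the equivalent is to set `r.encoding = 'utf-8'` before reading `r.text`, or to decode `r.content` yourself.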