Poison 5.0 fails to decode Unicode surrogate pairs, Poison 4.0.1 succeeds #217
Comments
Yeah, running into this as well. Poison 5.0:
Poison 4.0.1:
It seems related to a zero-width joiner between two surrogate pairs. Shorter examples I tried on Poison 5.0.0:
# with a zero-width joiner
iex(1)> Poison.decode(~S("\uD83D\uDC68\u200D\uD83D\uDC76"))
{:error,
%Poison.ParseError{
data: "\"\\uD83D\\uDC68\\u200D\\uD83D\\uDC76\"",
skip: 20,
value: "\\uD83D"
}}
# without a zero-width joiner
iex(2)> Poison.decode(~S("\uD83D\uDC68\uD83D\uDC76"))
{:ok, "👨👶"}
# with a zero-width joiner but the following character is not a surrogate pair
iex(3)> Poison.decode(~S("\uD83D\uDC6E\u200D\u2642"))
{:ok, "👮♂"}
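For reference, here is a minimal sketch of what a successful decode of the first example would be expected to produce, assuming the standard UTF-16 surrogate pair arithmetic. The module name `SurrogateSketch` and the function `combine/2` are hypothetical names used for illustration only; this is not how Poison decodes strings internally.

```elixir
defmodule SurrogateSketch do
  # Hypothetical helper: combine a UTF-16 high/low surrogate pair
  # into a single Unicode code point (standard UTF-16 arithmetic).
  def combine(hi, lo) when hi in 0xD800..0xDBFF and lo in 0xDC00..0xDFFF do
    0x10000 + (hi - 0xD800) * 0x400 + (lo - 0xDC00)
  end
end

# \uD83D\uDC68 is the first surrogate pair, \uD83D\uDC76 the second.
man = SurrogateSketch.combine(0xD83D, 0xDC68)
baby = SurrogateSketch.combine(0xD83D, 0xDC76)

# \u200D is the zero-width joiner sitting between the two pairs.
expected = <<man::utf8, 0x200D::utf8, baby::utf8>>

IO.puts(expected)
```

Under these assumptions, the two pairs map to U+1F468 and U+1F476, so a decoder that accepts the input would return those two code points with U+200D between them.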
All of these examples fail in the browser. Poison 5.0 passes all spec tests, so I'm wary of allowing strings to parse that won't parse in a browser environment.
Reproduction: