This repository has been archived by the owner on Oct 23, 2019. It is now read-only.

mbstring serialize/unserialize incorrect behaviour compared to PHP #68

Open
lucyllewy opened this issue Oct 8, 2016 · 1 comment

@lucyllewy
Contributor

Serializing a multibyte character and then unserializing it in Phalanger changes the character when it is echoed, as shown in the test case below.

Phalanger behaves differently to PHP in this respect:

  • PHP prints é for both $val1 and $val2.
  • Phalanger prints é for $val1, but ? for $val2.
<?php
$val1 = 'é';
$val2 = unserialize(serialize($val1));

$EOL = "<br>\n";

// Test whether the supposedly identical strings compare as equal
echo ($val1 === $val2 ? 'Equal' : 'Not Equal') . $EOL; // Prints 'Equal', because $val1 === $val2 === 'é'
echo htmlentities($val1) . $EOL; // Prints the HTML entity for 'é'
echo htmlentities($val2) . $EOL; // Prints '?' in Phalanger (even though $val1 === $val2!)
@jakubmisek
Member

serialize/unserialize convert Unicode strings according to the current PageEncoding, which defaults to the encoding of the Windows system culture. I recommend always setting PageEncoding to UTF-8 to avoid these issues.
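
To see why the encoding matters here, it can help to dump the raw bytes on both sides of the round trip. The sketch below is a diagnostic only, not a reproduction of Phalanger's internals. It writes 'é' as explicit UTF-8 bytes (an assumption, so that the result does not depend on how the source file is saved) and relies on the fact that stock PHP's serialized form records the string's byte length.

```php
<?php
// Assumption: 'é' encoded as UTF-8, i.e. the two bytes C3 A9.
$val1 = "\xC3\xA9";

// In stock PHP the serialized form stores the BYTE length (2 here),
// so the result depends on the encoding used to turn text into bytes.
$serialized = serialize($val1);
$val2 = unserialize($serialized);

echo $serialized, "\n";      // the "s:<byte length>:..." form
echo bin2hex($val1), "\n";   // bytes before the round trip
echo bin2hex($val2), "\n";   // bytes after the round trip
```

If the two hex dumps differ under Phalanger, or the serialized length is not 2, that would point at the byte conversion governed by PageEncoding rather than at htmlentities itself.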
