Version 0.0.5 released.

@Jehan Jehan released this 05 Dec 12:25
· 65 commits to master since this release
  • Revert UTF-16 and UTF-32 label change:
    it was an error to specify endianness for texts with a BOM.
    The Unicode standard explicitly warns against it, and doing so
    even (partially) breaks conversions.
  • Added support for:
    • French: Windows-1252.
    • German: ISO-8859-1, Windows-1252.
    • Esperanto: ISO-8859-3
    • Turkish: ISO-8859-3 and ISO-8859-9
    • Thai: ISO-8859-11 (and TIS-620 model rebuilt).
  • Single-byte charset detection algorithm improved:
    detecting control characters now lowers confidence.
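
Why the UTF-16/UTF-32 label revert matters can be seen with a small illustration. This is not uchardet code; it is a minimal Python sketch using only the standard `codecs` module, and the sample string is an arbitrary choice. It shows that a converter given an endian-specific label treats the BOM as ordinary content, while the generic label lets the converter consume it:

```python
import codecs

text = "héllo"

# A little-endian UTF-16 payload preceded by a BOM (bytes FF FE).
data16 = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Generic label: the decoder reads the BOM, picks the byte order, and drops it.
generic16 = data16.decode("utf-16")

# Endian-specific label: the BOM is decoded as a U+FEFF character in the text.
specific16 = data16.decode("utf-16-le")

# The same behaviour holds for UTF-32, here with a big-endian BOM.
data32 = codecs.BOM_UTF32_BE + text.encode("utf-32-be")
generic32 = data32.decode("utf-32")
specific32 = data32.decode("utf-32-be")

print(repr(generic16), repr(specific16))
print(repr(generic32), repr(specific32))
```

With the endian-specific labels, the converted text starts with a stray U+FEFF, which is the kind of partial breakage the revert avoids. Other converters may behave differently in detail.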