Default and OEM character encodings in the Core edition should be Windows-1252, not ISO-8859-1

[ISO-8859-1](https://en.wikipedia.org/wiki/ISO/IEC_8859-1) is currently (alpha16) the default character encoding; it is also the encoding used when the explicit encoding specifiers `Default` and `OEM` are passed - see [here](https://github.com/PowerShell/PowerShell/issues/3248#issuecomment-284038694).

This **choice is problematic, because ISO-8859-1 covers only a _subset_ of the characters in the commonly used [Windows-1252](https://en.wikipedia.org/wiki/Windows-1252) encoding.**  
(The two encodings are often conflated, but they are _not_ the same.)

Specifically, **using ISO-8859-1 makes the following characters - the printable characters in the codepoint range `0x80 - 0x9F` - _unavailable_**:

    € ‚ ƒ „ … † ‡ ˆ ‰ Š ‹ Œ Ž ‘ ’ “ ” • – — ˜ ™ š › œ ž Ÿ

Note that the `€` character is part of that list.
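The list above can be reproduced by decoding every byte in the range with Windows-1252 and dropping the control characters. This is a minimal sketch, not a definitive check: it assumes that `CodePagesEncodingProvider` is available in the session (as in the Windows-1252 example in this issue) and that .NET decodes the five unassigned bytes in that range to C1 control characters. The variable names `$enc1252` and `$printable` are chosen for illustration only.

```powershell
# Assumption: the code-pages provider type is loaded, as in the Windows-1252 example in this issue.
$enc1252 = [System.Text.CodePagesEncodingProvider]::Instance.GetEncoding(1252)

# Decode each byte 0x80..0x9F as a one-byte Windows-1252 string,
# then keep only the results that are not control characters.
$printable = 0x80..0x9F |
    ForEach-Object { $enc1252.GetString([byte[]] $_) } |
    Where-Object { -not [char]::IsControl($_, 0) }

# Print the surviving characters on one line, followed by their count.
$printable -join ' '
$printable.Count
```

If the assumptions hold, the first output line matches the list above.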

You can verify the problematic behavior as follows:

    > '€' | Set-Content tmp.txt; Get-Content tmp.txt
    ?

Because `€` cannot be represented in ISO-8859-1, it was quietly converted to a _literal_ `?`.

Contrast this with use of Windows-1252:

    > $enc1252 = [System.Text.CodePagesEncodingProvider]::Instance.GetEncoding(1252); [IO.File]::WriteAllText('tmp.txt', '€', $enc1252); [IO.File]::ReadAllText('tmp.txt', $enc1252)
    €

The `€` character - encoded as byte `0x80` in Windows-1252, a byte value that ISO-8859-1 does not assign to any printable character - was correctly preserved.
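The difference can also be seen at the byte level, without going through a file. The following is a sketch under assumptions: code page `28591` is used to request ISO-8859-1 from .NET, the 1252 encoding is obtained the same way as in the example above, and the variable names are illustrative. `[char]0x20AC` is used instead of a literal `€` so that the result does not depend on how the script itself is encoded.

```powershell
# Assumption: 28591 is the .NET code-page number for ISO-8859-1 (Latin-1).
$latin1  = [System.Text.Encoding]::GetEncoding(28591)
$enc1252 = [System.Text.CodePagesEncodingProvider]::Instance.GetEncoding(1252)

# U+20AC is the euro sign.
$euro = [string][char]0x20AC

# Encode the same one-character string with both encodings.
$latin1Bytes = $latin1.GetBytes($euro)
$cp1252Bytes = $enc1252.GetBytes($euro)

# Show the resulting byte values in hexadecimal.
'ISO-8859-1  : {0:X2}' -f $latin1Bytes[0]
'Windows-1252: {0:X2}' -f $cp1252Bytes[0]
```

Given the behavior shown above, the ISO-8859-1 line should report `3F` (the literal `?` substituted for the unrepresentable character), while the Windows-1252 line reports `80`.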

---

Also, **please note that in order to fully emulate _Windows_ PowerShell behavior, using a _fixed_ encoding in Core is _not_ sufficient.**

Instead, the encoding would have to be locale-dependent, as on Windows:
Unix locales would have to be mapped to the Windows legacy codepages - see [here](https://github.com/PowerShell/PowerShell/issues/3248#issuecomment-284241580).

Default and OEM character encodings in the Core edition should be Windows-1252, not ISO-8859-1 #3258
