Better codepage support #63

ali1234 · 2021-02-01T03:01:32Z

Codepage support currently covers only Latin/English, plus partial Cyrillic. Cyrillic has to be selected manually and works only for HTML output, not for console output.

My understanding of how it works:

For a given page there are two G0 charsets, and one G2 charset.
ESC switches between the two G0 charsets. It does not affect G2 characters.
Packets X/28/0 Format 1 or X/28/4 define the charsets used by a page; packets M/29/0 or M/29/4 define them for a whole magazine.
In the absence of those packets, defaults are programmed into the decoder? (i.e. the local code of practice.)
Control bits in the page header further modify the charsets. Their exact meaning depends on the currently selected charsets or the local code of practice. These bits are also duplicated in the X/28 and M/29 packets?
There are five basic regions: Latin, Greek, Cyrillic, Arabic and Hebrew.
Latin and Cyrillic have national options. For Latin these redefine a fixed subset of characters. For Cyrillic the redefined characters differ between options.
Hebrew doesn't have its own G2 charset.
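To make the ESC behaviour above concrete, here is a minimal Python sketch. It is not the project's code: `decode_row`, `LATIN_G0` and `TOY_SECOND_G0` are hypothetical names, the second table holds only two illustrative entries rather than a real charset, and it assumes that ESC is displayed as a space, takes effect from the following character, and that each row starts in the first G0 charset. G2 is left out because ESC does not affect it.

```python
ESC = 0x1B  # teletext control code that switches between the two G0 charsets

# Hypothetical tables: an empty dict falls back to plain ASCII, and the
# second table carries just two sample entries for illustration.
LATIN_G0 = {}
TOY_SECOND_G0 = {0x41: "А", 0x42: "Б"}


def decode_row(data: bytes, g0_first: dict, g0_second: dict) -> str:
    """Decode one row, toggling the active G0 charset on every ESC."""
    active = g0_first  # assumption: every row starts in the first G0 charset
    out = []
    for byte in data:
        byte &= 0x7F  # drop the parity bit
        if byte == ESC:
            # Toggle the active G0 charset; the ESC cell itself shows a space.
            active = g0_second if active is g0_first else g0_first
            out.append(" ")
        elif byte < 0x20:
            out.append(" ")  # other control codes are not handled in this sketch
        else:
            out.append(active.get(byte, chr(byte)))
    return "".join(out)


row_text = decode_row(b"AB\x1bAB\x1bAB", LATIN_G0, TOY_SECOND_G0)
```

The same bytes `AB` decode differently before and after each ESC, which is why a decoder needs both G0 charsets for a page before it can render a row.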

ali1234 · 2021-02-01T03:09:41Z

It will be difficult to handle X/28 and M/29 packets in the current codebase because the Parser object keeps no state. That is, it doesn't know which packets have been processed before or after the current one. For HTML this is less of a problem because it operates on whole pages only. It still can't handle M/29, but that can be implemented in Service fairly easily.
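As a rough illustration of the state that would be needed, the sketch below remembers the last charset designation seen per magazine (from M/29) and per page (from X/28) and falls back to a user-selected default. `CharsetTracker` and its methods are hypothetical and not part of the existing Service or Parser classes, charsets are represented as plain strings, and the precedence (page over magazine over default) is an assumption based on the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class CharsetTracker:
    """Hypothetical holder for charset designations seen so far."""

    default: str  # the user-selected "local code of practice"
    magazine: Dict[int, str] = field(default_factory=dict)
    page: Dict[Tuple[int, int], str] = field(default_factory=dict)

    def on_m29(self, mag: int, charset: str) -> None:
        # An M/29 packet sets the charset for a whole magazine.
        self.magazine[mag] = charset

    def on_x28(self, mag: int, page: int, charset: str) -> None:
        # An X/28 packet sets the charset for a single page.
        self.page[(mag, page)] = charset

    def resolve(self, mag: int, page: int) -> str:
        # Assumed precedence: page setting, then magazine setting, then default.
        fallback = self.magazine.get(mag, self.default)
        return self.page.get((mag, page), fallback)


tracker = CharsetTracker(default="latin")
tracker.on_m29(1, "cyrillic")
tracker.on_x28(1, 0x00, "greek")
```

With this state, `tracker.resolve(1, 0x00)` gives the page-level setting, other pages in magazine 1 get the magazine-level setting, and every other magazine gets the default.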

For now that means console output is limited to letting the user select a "local code of practice" which is used for all packets. That seems fairly close to what level 1 does, and we don't really support higher levels anyway.

To help the user do this, the console output should decode X/28 and M/29 packets when printing them.
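A first step could be simply flagging these packets in the console output. The helper below is a hypothetical sketch, not existing code: it assumes the printer already knows each packet's row number and designation code, and it only labels the packet. It does not decode the charset fields, and it cannot tell X/28/0 Format 1 from other formats because that depends on the packet contents.

```python
from typing import Optional


def charset_packet_label(row: int, dc: int) -> Optional[str]:
    """Return a label for packets that can carry charset designations.

    row is the packet row number and dc its designation code. Only the
    combinations named in this issue (X/28/0, X/28/4, M/29/0, M/29/4)
    are recognised; everything else returns None.
    """
    if dc not in (0, 4):
        return None
    if row == 28:
        return f"X/28/{dc}"
    if row == 29:
        return f"M/29/{dc}"
    return None


labels = [charset_packet_label(r, d) for r, d in [(28, 0), (29, 4), (28, 1), (27, 0)]]
```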

ali1234 added the feature label Feb 1, 2021

ali1234 linked a pull request Feb 1, 2021 that will close this issue

Add support for cyrillic render to HTML #60

Closed

ali1234 mentioned this issue Feb 1, 2021

Implement charset support for different languages #51

Closed

ali1234 added this to To do in v3.2.0 via automation Feb 1, 2021

ali1234 mentioned this issue Feb 1, 2021

Elements should not call into the various Printers #64

Open

ali1234 mentioned this issue Feb 20, 2023

Improve support for international character sets #74

Open

8 tasks
