Better codepage support #63

Open
ali1234 opened this issue Feb 1, 2021 · 1 comment

ali1234 commented Feb 1, 2021

Codepage support currently covers only Latin/English, plus partial Cyrillic. The latter has to be selected manually and works only for HTML output, not for console output.

My understanding of how it works:

  • For a given page there are two G0 charsets, and one G2 charset.
  • ESC switches between the two G0 charsets. It does not affect G2 characters.
  • Packets X/28/0 Format 1 or X/28/4 define the charsets used by a page; packets M/29/0 or M/29/4 define the charsets used by a magazine.
  • In the absence of those packets, defaults are programmed into the decoder? (i.e. the local code of practice.)
  • Control bits in the page header further modify the charsets. Their exact meaning depends on the currently selected charsets or the local code of practice. These bits are also duplicated in the X/28 and M/29 packets?
  • There are five basic regions: Latin, Greek, Cyrillic, Arabic, Hebrew
  • Latin and Cyrillic have national options. For Latin, each option redefines the same fixed subset of character positions. For Cyrillic, the options redefine a different set of characters.
  • Hebrew doesn't have its own G2 charset.
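
The ESC behaviour in the list above can be sketched roughly as follows. This is only an illustration under assumptions: `decode_row` and both table names are hypothetical and not part of the existing code, the second table is a two-entry placeholder rather than a real charset, and the sketch assumes each row starts in the first G0 set and that ESC itself is displayed as a space. G2 characters are left out entirely.

```python
ESC = 0x1B  # teletext "switch" control code

# Hypothetical tables: code -> character. Missing codes fall back to ASCII.
G0_FIRST = {}
G0_SECOND_PLACEHOLDER = {0x41: '\u0410', 0x42: '\u0411'}  # placeholder, not a real table


def decode_row(data, g0_first, g0_second):
    """Decode one row of 7-bit codes, toggling the active G0 set on ESC."""
    active = g0_first  # assumption: every row starts in the first G0 set
    out = []
    for code in data:
        if code == ESC:
            # Toggle between the two G0 sets; G2 is unaffected.
            active = g0_second if active is g0_first else g0_first
            out.append(' ')
        elif code < 0x20:
            out.append(' ')  # other control codes occupy a blank cell
        else:
            out.append(active.get(code, chr(code)))
    return ''.join(out)
```

Note that even this small function needs state within a row (which G0 set is active), although it needs nothing from earlier packets.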
ali1234 added the feature label Feb 1, 2021
ali1234 linked a pull request Feb 1, 2021 that will close this issue

ali1234 commented Feb 1, 2021

It will be difficult to handle X/28 and M/29 packets in the current codebase because the Parser object has no state. That is, it doesn't know which packets were processed before or after the current one. For HTML this is less of a problem because it operates on whole pages only. It still can't handle M/29, but that can be implemented in Service fairly easily.
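
A minimal sketch of the kind of state this would need, wherever it ends up living. Everything here is hypothetical: `CharsetTracker` and its methods are not existing vhs-teletext classes, charsets are represented as plain strings, and the sketch assumes a page-level (X/28) designation takes priority over a magazine-level (M/29) one, which in turn takes priority over the default.

```python
class CharsetTracker:
    """Remember the latest charset designations seen in the packet stream."""

    def __init__(self, default):
        self.default = default  # local code of practice
        self.by_magazine = {}   # magazine -> charset from M/29
        self.by_page = {}       # (magazine, page) -> charset from X/28

    def on_m29(self, magazine, charset):
        self.by_magazine[magazine] = charset

    def on_x28(self, magazine, page, charset):
        self.by_page[(magazine, page)] = charset

    def lookup(self, magazine, page):
        # Assumed precedence: X/28, then M/29, then the default.
        if (magazine, page) in self.by_page:
            return self.by_page[(magazine, page)]
        return self.by_magazine.get(magazine, self.default)
```

Usage would be along these lines: feed every X/28 and M/29 packet to the tracker as it arrives, then call `lookup(magazine, page)` before decoding each display row.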

For console output, that currently means we are limited to letting the user select a "local code of practice" that is used for all packets. That seems fairly close to what Level 1 does, and we don't really support higher levels anyway.

  • To help the user do this, the console output should decode X/28 and M/29 packets when printing them.
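
One way a user-selected "local code of practice" could be applied to every packet is a single lookup table built once from the chosen option. A rough sketch for the Latin case follows. `build_g0`, `render` and the option names are hypothetical, and the two option strings are written from memory of the ETS 300 706 national option subsets, so they should be checked against the spec before use.

```python
# The 13 code positions that a Latin national option redefines.
NATIONAL_POSITIONS = (0x23, 0x24, 0x40, 0x5B, 0x5C, 0x5D, 0x5E,
                      0x5F, 0x60, 0x7B, 0x7C, 0x7D, 0x7E)

# Replacement characters, in the same order as NATIONAL_POSITIONS.
# Assumption: recalled from the spec, not verified here.
NATIONAL_OPTIONS = {
    'english': '£$@←½→↑#\u2015¼‖¾÷',  # \u2015 is a horizontal bar
    'german': '#$§ÄÖÜ^_°äöüß',
}


def build_g0(option):
    """Build a code -> character table for the selected national option."""
    table = {code: chr(code) for code in range(0x20, 0x7F)}
    table.update(zip(NATIONAL_POSITIONS, NATIONAL_OPTIONS[option]))
    return table


def render(data, table):
    """Render bytes with one fixed table; unknown codes become spaces."""
    return ''.join(table.get(byte & 0x7F, ' ') for byte in data)
```

Because the table is fixed for the whole run, this needs no state in the parser. The option would presumably come from a command-line flag.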

Projects: v3.2.0 (To do)