internal IdentityEncoder should be more clear with rune handling #360

Open
gunnsth opened this issue May 23, 2020 · 0 comments

gunnsth commented May 23, 2020

The IdentityEncoder represents the Identity-H and Identity-V encodings, which map 2-byte character codes to 2-byte CIDs:

The horizontal identity mapping for 2-byte CIDs; may be used with CIDFonts
using any Registry, Ordering, and Supplement values. It maps 2-byte character
codes ranging from 0 to 65,535 to the same 2-byte CID value, interpreted high-order
byte first.
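
To make the mapping concrete, here is a minimal sketch of what "interpreted high-order byte first" means for a byte string. The `CID` type and the `decodeIdentity` function are hypothetical names used only for illustration; they are not part of the existing encoder.

```go
package main

import (
	"errors"
	"fmt"
)

// CID is a hypothetical named type for a character identifier.
// It is deliberately not a rune: a CID carries no Unicode meaning.
type CID uint16

// decodeIdentity (hypothetical helper) splits data into 2-byte character
// codes and returns each one as the CID with the same value, reading the
// high-order byte first.
func decodeIdentity(data []byte) ([]CID, error) {
	if len(data)%2 != 0 {
		return nil, errors.New("identity encoding: odd number of bytes")
	}
	cids := make([]CID, 0, len(data)/2)
	for i := 0; i < len(data); i += 2 {
		cids = append(cids, CID(data[i])<<8|CID(data[i+1]))
	}
	return cids, nil
}

func main() {
	cids, err := decodeIdentity([]byte{0x00, 0x41, 0x12, 0x34})
	if err != nil {
		panic(err)
	}
	fmt.Println(cids) // [65 4660]
}
```

Note that the value 65 here is only a CID. Whether it corresponds to the letter "A" depends entirely on the font, not on the encoding.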

When used with TrueType CID fonts, the CID values typically map directly to GIDs (glyph indices), so the CID value has no Unicode meaning. It is therefore confusing that IdentityEncoder implements the TextEncoder interface, with methods such as CharcodeToRune that return a "rune" which is not a Unicode code point but just the integer value of the CID. This can easily lead to problems.

We probably need to clarify the terminology and maybe split up the TextEncoder interface. Identity-H should just map bytes to CIDs. If a CIDToGIDMap is defined, it also needs to be applied.
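
As a rough sketch of the CIDToGIDMap step, assuming the map is available as raw stream bytes: in the PDF format the glyph index for CID c is stored in bytes 2c and 2c+1 of the stream, high-order byte first, and a map named Identity means the GID equals the CID. The `CID` and `GID` types and the `cidToGID` function are hypothetical, and a nil slice stands in for the Identity case as an assumption of this sketch.

```go
package main

import "fmt"

// CID and GID are hypothetical named types that keep character identifiers
// and glyph indices apart from each other and from runes.
type CID uint16
type GID uint16

// cidToGID (hypothetical helper) resolves a CID to a glyph index.
// A nil cidToGIDMap is treated as the Identity mapping (assumption of this
// sketch). Otherwise the GID for CID c is read from bytes 2c and 2c+1 of
// the map data, high-order byte first. The boolean is false when the CID
// lies outside the map.
func cidToGID(cid CID, cidToGIDMap []byte) (GID, bool) {
	if cidToGIDMap == nil {
		return GID(cid), true
	}
	i := 2 * int(cid)
	if i+1 >= len(cidToGIDMap) {
		return 0, false
	}
	return GID(cidToGIDMap[i])<<8 | GID(cidToGIDMap[i+1]), true
}

func main() {
	// Example map data covering CIDs 0, 1 and 2, which map to GIDs 0, 5 and 3.
	m := []byte{0x00, 0x00, 0x00, 0x05, 0x00, 0x03}

	gid, ok := cidToGID(1, m)
	fmt.Println(gid, ok) // 5 true

	gid, ok = cidToGID(7, m)
	fmt.Println(gid, ok) // 0 false

	gid, ok = cidToGID(7, nil)
	fmt.Println(gid, ok) // 7 true
}
```

Keeping `CID` and `GID` as distinct types means the compiler rejects code that passes either one where a rune is expected, which is the kind of confusion described above.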
