Improve TrueType spec compliance for generated font #123

Open
6 of 16 tasks
bsweeney opened this issue Oct 14, 2023 · 3 comments

@bsweeney
Member

bsweeney commented Oct 14, 2023

When generating a new font file, this library currently writes much of the font table data using values copied from the original font file. The resulting font may contain values that do not conform to the spec, because many of those values depend on the font structure, the supported characters, and the related glyphs.

This tracking issue identifies and tracks those problems.

  • global table issues
  • font header
    • Font header invalidated by modification of the number of tables in the font
    • Table checksum does not match expectations
    • Tables in the directory are not sorted by tag.
  • head table
    • subsetting invalidates the font checksum
  • name table
  • cmap table
    • Unicode 0xFFFF (65535) should be present but not mapped
    • Improve character encoding support
    • improve error handling (Improve cmap table (subtable format 4) error handling #127)
    • add support for subtable format 0
    • add support for subtable format 2
    • add support for subtable format 6
    • add support for subtable format 8
    • add support for subtable format 10
    • add support for subtable format 13
    • add support for subtable format 14
@bsweeney
Member Author

bsweeney commented Oct 15, 2023

Regarding table order in the directory: per the spec, entries in the table directory must be sorted in ascending order by tag.
https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6.html

However, some tables can only be encoded after the tables they depend on have been encoded. To address this, we could write all of the table data first, then write the table directory entries sorted by tag.
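
A rough sketch of that two-pass idea, in Python purely for illustration. `build_font` and the `tables` dict are hypothetical names, not this library's API, and the checksum and search fields are left as zero placeholders:

```python
import struct

def build_font(tables, sfnt_version=0x00010000):
    """tables: dict of 4-byte tag (bytes) -> encoded table data, in encoding order.

    Hypothetical helper: checksums and search fields are placeholders (0).
    """
    num_tables = len(tables)
    header_size = 12 + 16 * num_tables  # offset table + one 16-byte entry per table

    # Pass 1: lay out table data in whatever order the encoder produced it,
    # padding each table to a 4-byte boundary and remembering where it landed.
    body = bytearray()
    records = {}
    for tag, data in tables.items():
        records[tag] = (header_size + len(body), len(data))
        body += data
        body += b"\0" * (-len(data) % 4)

    # Pass 2: emit the directory entries in ascending tag order.
    directory = bytearray()
    for tag in sorted(records):
        offset, length = records[tag]
        directory += struct.pack(">4sIII", tag, 0, offset, length)

    header = struct.pack(">IHHHH", sfnt_version, num_tables, 0, 0, 0)
    return bytes(header + directory + body)
```

The data order and the directory order are independent here, so dependent tables can still be encoded last while the directory stays sorted.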

bsweeney added a commit that referenced this issue Oct 16, 2023
From the TrueType font specification section about the cmap table:

> No character code should be mapped to glyph index -1 (0xFFFF), which is a special value reserved in processing to indicate the position of a glyph deleted from the glyph stream.

ref #123
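
For a format 4 subtable, one way to handle this is the terminating segment: startCode and endCode are both 0xFFFF, and idDelta is 1 so the code resolves to glyph 0 (the missing glyph). A minimal sketch in Python, where `build_segments` and `lookup` are hypothetical helpers and idRangeOffset is assumed to be 0 for every segment:

```python
def build_segments(char_to_glyph):
    """Group {char code: glyph index} into (start_code, end_code, id_delta)
    segments and append the 0xFFFF terminator."""
    segments = []
    # Skip 0xFFFF itself and anything mapped to the reserved glyph index 0xFFFF.
    codes = sorted(
        c for c, g in char_to_glyph.items() if c < 0xFFFF and g != 0xFFFF
    )
    for code in codes:
        delta = (char_to_glyph[code] - code) % 0x10000
        if segments and segments[-1][1] == code - 1 and segments[-1][2] == delta:
            # Extend the previous segment: contiguous code with the same delta.
            segments[-1] = (segments[-1][0], code, delta)
        else:
            segments.append((code, code, delta))
    # Terminator: 0xFFFF is present, but (0xFFFF + 1) % 0x10000 == glyph 0.
    segments.append((0xFFFF, 0xFFFF, 1))
    return segments

def lookup(segments, code):
    """Resolve a char code to a glyph index; 0 means 'missing glyph'."""
    for start, end, delta in segments:
        if start <= code <= end:
            return (code + delta) % 0x10000
    return 0
```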
bsweeney added a commit that referenced this issue Oct 16, 2023
...before calculating the head table checksum. Per the head table specification:

> checkSumAdjustment: To compute set it to 0, calculate the checksum for the 'head' table and put it in the table directory, sum the entire font as a uint32_t, then store 0xB1B0AFBA - sum. (The checksum for the 'head' table will be wrong as a result. That is OK; do not reset it.)

ref #123
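
The quoted procedure can be sketched as follows (Python for illustration only; `table_checksum` and `checksum_adjustment` are hypothetical names, and the caller is assumed to have already zeroed checkSumAdjustment and filled in the table directory):

```python
import struct

def table_checksum(data):
    """Sum the data as big-endian uint32 values, modulo 2**32.

    The data is zero-padded to a multiple of 4 bytes before summing.
    """
    padded = data + b"\0" * (-len(data) % 4)
    total = 0
    for (value,) in struct.iter_unpack(">I", padded):
        total = (total + value) & 0xFFFFFFFF
    return total

def checksum_adjustment(font_bytes):
    """Value to store in head.checkSumAdjustment.

    Assumes font_bytes is the complete font with checkSumAdjustment set to 0.
    """
    return (0xB1B0AFBA - table_checksum(font_bytes)) & 0xFFFFFFFF
```

Once the adjustment is written back into the head table (bytes 8 to 11 of the table), the uint32 sum of the whole font should come out to 0xB1B0AFBA.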
bsweeney added a commit that referenced this issue Oct 16, 2023
The number of tables in the font may change when generating a font file. Because the font header is dependent on the number of tables the values need to be recalculated.

ref #123
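
For reference, the header fields that depend on the table count are searchRange, entrySelector, and rangeShift. A small Python sketch of the recalculation (`directory_search_fields` is a hypothetical name, not a function in this library):

```python
def directory_search_fields(num_tables):
    """Return (searchRange, entrySelector, rangeShift) for the font header."""
    if num_tables < 1:
        raise ValueError("a font needs at least one table")
    # entrySelector is floor(log2(num_tables)).
    entry_selector = num_tables.bit_length() - 1
    # searchRange is the largest power of two <= num_tables, times 16.
    search_range = (1 << entry_selector) * 16
    # rangeShift covers the directory entries beyond searchRange.
    range_shift = num_tables * 16 - search_range
    return search_range, entry_selector, range_shift
```

For example, a font that goes from 16 tables to 9 after subsetting changes from (256, 4, 0) to (128, 3, 16).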
@bsweeney
Member Author

I'm looking at improving string encoding support, but deeper analysis is needed to determine the full range of scenarios that need to be handled.

Of special note, support for platform 1 is being deferred for now, primarily because the spec discourages use of this platform:

> Names with platformID 1 were required by earlier versions of macOS. Its use on modern platforms is discouraged. Use names with platformID 3 instead for maximum compatibility. Some legacy software, however, may still require names with platformID 1, platformSpecificID 0.

I will revisit supporting platform 1 as needed, since that platform uses legacy Macintosh text encodings:

> Strings for the Macintosh platform (platform ID 1) use platform-specific single- or double-byte encodings according to the specified encoding ID for a given name record.
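
As a sketch of what that deferral could look like in code (Python for illustration; `decode_name` is a hypothetical helper, and treating every platform 0 and platform 3 string as UTF-16BE is an assumption that would need checking against the spec for each encoding ID):

```python
def decode_name(platform_id, encoding_id, raw):
    """Decode the raw bytes of a name record into a str."""
    if platform_id in (0, 3):
        # Assumption: Unicode and Windows platform strings are UTF-16BE.
        return raw.decode("utf-16-be")
    if platform_id == 1:
        # Deferred: the encoding depends on encoding_id (legacy Macintosh scripts).
        raise NotImplementedError(
            f"platform 1 (encoding {encoding_id}) is not supported yet"
        )
    raise ValueError(f"unknown platform ID {platform_id}")
```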

bsweeney added a commit that referenced this issue Dec 12, 2023
From the TrueType font specification section about the cmap table:

> No character code should be mapped to glyph index -1 (0xFFFF), which is a special value reserved in processing to indicate the position of a glyph deleted from the glyph stream.

ref #123
bsweeney added a commit that referenced this issue Dec 12, 2023
...before calculating the head table checksum. Per the head table specification:

> checkSumAdjustment: To compute set it to 0, calculate the checksum for the 'head' table and put it in the table directory, sum the entire font as a uint32_t, then store 0xB1B0AFBA - sum. (The checksum for the 'head' table will be wrong as a result. That is OK; do not reset it.)

ref #123
bsweeney added a commit that referenced this issue Dec 12, 2023
The number of tables in the font may change when generating a font file. Because the font header is dependent on the number of tables the values need to be recalculated.

ref #123
@bsweeney
Member Author

bsweeney commented Dec 16, 2023

It might be useful to support encoding conversion (e.g., Big5 to Unicode). Some internal processes rely on a Unicode cmap. Notably, the library writes out the cmap table hard-coded to platform ID 3, platform-specific ID 1, so writing the font will fail without a Unicode map.

Unicode provides a collection of conversion tables:
https://github.com/unicode-org/icu-data/blob/main/charset/data/ucm
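
A rough illustration of the conversion step, using Python's built-in `big5` codec as a stand-in for the ICU tables (`big5_cmap_to_unicode` is a hypothetical name; a real implementation would probably load the mapping tables directly, and coverage may differ from the .ucm data):

```python
def big5_cmap_to_unicode(big5_map):
    """Convert {Big5 char code: glyph index} to {Unicode code point: glyph index}."""
    unicode_map = {}
    for code, glyph in big5_map.items():
        # Big5 codes are one byte (ASCII range) or two bytes (lead + trail).
        raw = code.to_bytes(2 if code > 0xFF else 1, "big")
        try:
            char = raw.decode("big5")
        except UnicodeDecodeError:
            continue  # no mapping for this code in the codec's table
        if len(char) == 1:
            unicode_map[ord(char)] = glyph
    return unicode_map
```

The resulting map could then feed the existing platform 3, platform-specific ID 1 cmap writer.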

bsweeney added a commit that referenced this issue Dec 29, 2023
A font with a format 2 cmap table will still re-encode with a format 4 cmap table.

relates to #123
bsweeney added a commit that referenced this issue Dec 30, 2023
From the TrueType font specification section about the cmap table:

> No character code should be mapped to glyph index -1 (0xFFFF), which is a special value reserved in processing to indicate the position of a glyph deleted from the glyph stream.

ref #123
bsweeney added a commit that referenced this issue Dec 30, 2023
...before calculating the head table checksum. Per the head table specification:

> checkSumAdjustment: To compute set it to 0, calculate the checksum for the 'head' table and put it in the table directory, sum the entire font as a uint32_t, then store 0xB1B0AFBA - sum. (The checksum for the 'head' table will be wrong as a result. That is OK; do not reset it.)

ref #123
bsweeney added a commit that referenced this issue Dec 30, 2023
The number of tables in the font may change when generating a font file. Because the font header is dependent on the number of tables the values need to be recalculated.

ref #123
bsweeney added a commit that referenced this issue Dec 30, 2023
A font with a format 2 cmap table will still re-encode with a format 4 cmap table.

relates to #123
bsweeney added a commit that referenced this issue Jan 6, 2024
From the TrueType font specification section about the cmap table:

> No character code should be mapped to glyph index -1 (0xFFFF), which is a special value reserved in processing to indicate the position of a glyph deleted from the glyph stream.

ref #123
bsweeney added a commit that referenced this issue Jan 6, 2024
...before calculating the head table checksum. Per the head table specification:

> checkSumAdjustment: To compute set it to 0, calculate the checksum for the 'head' table and put it in the table directory, sum the entire font as a uint32_t, then store 0xB1B0AFBA - sum. (The checksum for the 'head' table will be wrong as a result. That is OK; do not reset it.)

ref #123
bsweeney added a commit that referenced this issue Jan 6, 2024
The number of tables in the font may change when generating a font file. Because the font header is dependent on the number of tables the values need to be recalculated.

ref #123
bsweeney added a commit that referenced this issue Jan 6, 2024
A font with a format 2 cmap table will still re-encode with a format 4 cmap table.

relates to #123