Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace most Foxit fonts with Liberation fonts #14843

Open
Snuffleupagus opened this issue Apr 26, 2022 · 14 comments · Fixed by #16363
Open

Replace most Foxit fonts with Liberation fonts #14843

Snuffleupagus opened this issue Apr 26, 2022 · 14 comments · Fixed by #16363

Comments

@Snuffleupagus
Copy link
Collaborator

Snuffleupagus commented Apr 26, 2022

As seen in #12726 (comment) the Foxit fonts are not "complete" enough, which leads to broken rendering in some PDF documents.
(This is one reason that we're not unconditionally always loading the standard font data-files, another being that doing so affects performance according to the discussion in that PR.)

Furthermore, as suggested and agreed upon in #12726 (comment), #12726 (comment), and #12726 (comment), the idea was to replace the Foxit fonts (except FoxitSymbol.pfb and FoxitDingbats.pfb) with the more complete Liberation fonts instead.

Note that we already ship some Liberation fonts, for use with XFA-documents, and that we thus currently have some duplication in the standard font files. Hence we should replace the FoxitFixed... and FoxitSerif... fonts with their Liberation equivalents.

/cc @calixteman

@marco-c
Copy link
Contributor

marco-c commented Feb 10, 2023

Would this reduce the size of the built-in PDF.js viewer?

@Snuffleupagus
Copy link
Collaborator Author

Would this reduce the size of the built-in PDF.js viewer?

Difficult to tell without trying, and it might also be possible to compact the Liberation fonts.
However, please note that as mentioned above we ship some "duplicated" font-data and the Foxit fonts are also "incomplete".

@marco-c
Copy link
Contributor

marco-c commented Feb 10, 2023

OK, thanks. It seems worth investigating since on GeckoView size is important.

@marco-c
Copy link
Contributor

marco-c commented Apr 27, 2023

#16363 is going to do part of this.

@calixteman
Copy link
Contributor

I need to make some tests to be sure that everything is fine, but my plan is to convert the different Liberations fonts we need (except LiberationSans because we want to be able to rescale it on the fly for XFA) into WOFF2 format.
I hope we could limit the total size increase to something between 200 and 300Kb which should acceptable on Android.

@calixteman calixteman self-assigned this Apr 27, 2023
@Snuffleupagus
Copy link
Collaborator Author

Snuffleupagus commented Apr 27, 2023

Please keep the Node.js, or browsers with disableFontFace = true set, use-case in mind since we then load and parse the standard-font completely on the worker-thread (including using the FontRenderer).

@calixteman
Copy link
Contributor

@Snuffleupagus, do you have any idea on how disableFontFace = true is useful:

  • in the Firefox context ?
  • in the browser context ?

At least in the Firefox context could you imagine a case where this option is really useful ?
Especially if we remove it what kind of complains could we have ?
I tend to think that almost nobody change this kind of option in about:config so it should false for most of the user.
I want to keep the case where we need to fallback on the FontRenderer to draw the glyphs but do you think it's possible to fail when binding a WOFF2 font ?
If it's really necessary to keep the option, in case it's set to true then we could just keep the fontface stuff for the woff2 standard fonts and use the FontRenderer for the other ones, wdyt ? I'd prefer not having such exceptions it's why I'd prefer removing the option.

@Snuffleupagus
Copy link
Collaborator Author

do you have any idea on how disableFontFace = true is useful:
* in the Firefox context ?
* in the browser context ?

As previously mentioned, that option is mostly used in Node.js environments. Hence if you remove that you've essentially made PDF.js significantly less usable in Node.js, which is bound to lead to a never ending stream of user complaints (and it'd probably be an api-major change).

I want to keep the case where we need to fallback on the FontRenderer to draw the glyphs but do you think it's possible to fail when binding a WOFF2 font ?

Please note that all of this requires the font-data to be parse-able by the code in src/core/fonts.js, possibly with the aid of our Type1/CFF parsers, since I really cannot imagine how it'd work otherwise.
Edit: Note also the "FontFallback"-code, on both the main- and worker-threads, when a font for some reason fails to load (e.g. if it's rejected by the sanitizer).

If the WOFF2 format is sufficiently compatible with TrueType and/or OpenType, you might be able to "extract" usable parts from it such that our existing font-parsing works; see e.g. how the TrueTypeCollection-support is implemented.
Edit: Also, if the user has disabled web-fonts in the browser we fallback to disableFontFace=true to actually be able to render the document properly; note this code.

@calixteman
Copy link
Contributor

Afaik WOFF2 format is different of the usual formats (at least because data are compressed with brotli), so we won't be able to pass a woff2 file to any parsers we've.
My idea is just to use it as a possible fallback like here: https://github.com/mozilla/pdf.js/pull/16363/files#diff-84365a5fa783e74d4c83766a21305541dedb5d8ccefa51df9f95fa3a38bf0cfcR159
Can you imagine any drawbacks for this approach ?

@calixteman
Copy link
Contributor

Edit: Also, if the user has disabled web-fonts in the browser we fallback to disableFontFace=true to actually be able to render the document properly; note this code.

Interesting, I need to think about that.

@Snuffleupagus
Copy link
Collaborator Author

My idea is just to use it as a possible fallback like here: https://github.com/mozilla/pdf.js/pull/16363/files#diff-84365a5fa783e74d4c83766a21305541dedb5d8ccefa51df9f95fa3a38bf0cfcR159
Can you imagine any drawbacks for this approach ?

As mentioned that simply won't work in either Node.js environments or when we fallback to disableFontFace=true in the viewer when web-fonts are disabled in the browser. In both of those cases we need to fetch the standard font-data and then parse it just like we'd do with a "normal" embedded PDF font before finally using the FontRenderer.

Basically, if you want either of those cases to still work then the standard font-data files need to still be parse-able by the "regular" PDF.js font-code as far as I can tell.

@Snuffleupagus
Copy link
Collaborator Author

Another idea/question: Could we perhaps convert the Liberation fonts to CFF instead? That should be more compact than TrueType, while still retaining the ability to pass them to the regular PDF.js font-code; please also see #12726 (comment).

@calixteman
Copy link
Contributor

calixteman commented Apr 28, 2023

There are few tools we can play with to convert the font:

For example with the LiberationSans-Regular.ttf we've in the repo:

  • just running makeotf (makeotf -f LiberationSans-Regular.ttf -o out.otf fails because of FATAL: cmap{plat=3,script=1,lang=0}: multiple glyphs (uni00A0,dagger) mapped to code <A0>
  • but we can strip out few tables to make it happy: pyftsubset LiberationSans-Regular.ttf --unicodes="0000-FFFF" --ignore-missing-glyphs --drop-tables-="GPOS","name","GDEF" --layout-scripts="" --no-subset-tables+="FFTM"

Just doing that reduces the size from 139K to126K... it isn't exciting and it's because of hinting code.
So it's possible to strip out hinting in using ttfautohint --dehint LiberationSans-Regular.ttf out.ttf.
At the end we've something (after ttfautohint, pyftsubset and makeotf) we've a file around 69K.
But now if I run on the file generated by the subsetter (so no makeotf) the woff2 convert (I managed to run it from python shell and fonttools package) I get a file around 30K but if I run the compressor on the otf file then it's around 68K.

If now I run ttfautohint without -dehint (to keep some hinting informations), then run the subsetter and finally compress into woff2 I get a 36K files (and we've hinting data).

So about node.js we can keep ttf fonts (and strip out whatever we want).
About Firefox, at least for android (which only have Droid fonts as system fallback) I'd really prefer using woff2 (and if we need, just add a specific check in the Firefox code to allow web-fonts with a pdfjs principal). Potentially we could do the same with desktop. In both cases I need to know if there are any potential drawbacks for adding this specific check.
For other browsers we can do the same as in node.js.

I'll ask tuesday (monday is a bank holiday for a lot of people) if we have some telemetry about disabled web-fonts.

@Snuffleupagus
Copy link
Collaborator Author

Snuffleupagus commented Apr 30, 2023

(and if we need, just add a specific check in the Firefox code to allow web-fonts with a pdfjs principal). Potentially we could do the same with desktop. In both cases I need to know if there are any potential drawbacks for adding this specific check.

If I were to guess, I'd say that users probably disable web-fonts for security reasons (and I cannot imagine it being common).

There's obviously a question if the fonts embedded in PDF documents should actually be considered "web-fonts" as such, and it might indeed be fine to just allow the built-in Firefox PDF Viewer to always use PDF fonts (regardless of user-settings).
Please note: Given the font-parsing code in the PDF.js library we're not actually passing the PDF fonts as-is to the browser, and the fonts are also sanitized (with OTS) in the browser before being passed on to the OS.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Low priority
PDF.js quality
Low priority
3 participants