Cache the "raw" standard font data in the worker-thread (PR 12726 follow-up) #13515

Snuffleupagus · 2021-06-08T12:11:01Z

This implementation is basically a copy of the pre-existing builtInCMapCache implementation.

For some badly generated PDF documents, it's possible that we'll end up having to fetch the same standard font data over and over, which is obviously inefficient.
While not common, it's certainly possible that a PDF document uses custom font names where the actual font then references one of the standard fonts; see e.g. issue #11399 for one such example. Edit: Loading all pages of that document currently causes the FoxitSymbol.pfb file to be loaded thirteen times.

Note that I did suggest adding worker-thread caching of standard font data in PR #12726; however, it wasn't deemed necessary at the time. Now that we have a real-world example that benefits from caching, I think that we should simply implement this now.
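
To illustrate the idea, here is a minimal sketch of a worker-side cache keyed by font filename. It is not the code from this PR: the class name `StandardFontDataCache`, the `fetchFn` parameter, and the choice to cache promises rather than raw bytes are all assumptions made for the example.

```javascript
// Hypothetical sketch; the names here are illustrative and not taken from the PR diff.
class StandardFontDataCache {
  /**
   * @param {(filename: string) => Promise<Uint8Array>} fetchFn
   *   Assumed helper that asks the main thread (or the network) for the raw font file.
   */
  constructor(fetchFn) {
    this._fetch = fetchFn;
    this._cache = new Map(); // filename -> Promise<Uint8Array>
  }

  // Return the cached promise if present; otherwise start exactly one fetch.
  get(filename) {
    let promise = this._cache.get(filename);
    if (!promise) {
      promise = this._fetch(filename).catch(reason => {
        // Don't keep failed fetches around, so that a later call can retry.
        this._cache.delete(filename);
        throw reason;
      });
      this._cache.set(filename, promise);
    }
    return promise;
  }

  // Drop all cached data, e.g. when the document is closed.
  clear() {
    this._cache.clear();
  }
}
```

With a cache along these lines, repeated requests for e.g. `FoxitSymbol.pfb` while loading different pages would share a single fetch, since every call to `get` after the first returns the same promise.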

Snuffleupagus · 2021-06-08T13:37:04Z

/botio test

pdfjsbot · 2021-06-08T13:37:05Z

From: Bot.io (Linux m4)

Received

Command cmd_test from @Snuffleupagus received. Current queue size: 1

Live output at: http://54.67.70.0:8877/4e1c914f6aee4af/output.txt

pdfjsbot · 2021-06-08T13:37:05Z

From: Bot.io (Windows)

Received

Command cmd_test from @Snuffleupagus received. Current queue size: 1

Live output at: http://3.101.106.178:8877/9f44160cf612609/output.txt

pdfjsbot · 2021-06-08T14:29:18Z

From: Bot.io (Linux m4)

Failed

Full output at http://54.67.70.0:8877/4e1c914f6aee4af/output.txt

Total script time: 26.13 mins

Font tests: Passed
Unit tests: Passed
Integration Tests: Passed
Regression tests: FAILED

Image differences available at: http://54.67.70.0:8877/4e1c914f6aee4af/reftest-analyzer.html#web=eq.log

pdfjsbot · 2021-06-08T14:35:21Z

From: Bot.io (Windows)

Failed

Full output at http://3.101.106.178:8877/9f44160cf612609/output.txt

Total script time: 29.22 mins

Font tests: Passed
Unit tests: Passed
Integration Tests: FAILED
Regression tests: FAILED

Image differences available at: http://3.101.106.178:8877/9f44160cf612609/reftest-analyzer.html#web=eq.log


timvandermeij · 2021-06-09T19:06:16Z

Looks good to me; thank you for doing this!

Snuffleupagus added core performance font-conversion labels Jun 8, 2021

Snuffleupagus force-pushed the standardFontDataCache branch 2 times, most recently from d5bd3f6 to 11e3b7c Compare June 8, 2021 13:00

Snuffleupagus force-pushed the standardFontDataCache branch 2 times, most recently from 9faaec6 to 37c0a18 Compare June 9, 2021 16:27

Snuffleupagus force-pushed the standardFontDataCache branch from 37c0a18 to a01c599 Compare June 9, 2021 16:27

timvandermeij approved these changes Jun 9, 2021


timvandermeij merged commit 2a7827a into mozilla:master Jun 9, 2021

Snuffleupagus deleted the standardFontDataCache branch June 9, 2021 19:13
