Cache the "raw" standard font data in the worker-thread (PR 12726 follow-up) #13515

Merged: 1 commit into mozilla:master on Jun 9, 2021


Snuffleupagus
Collaborator

@Snuffleupagus Snuffleupagus commented Jun 8, 2021

This implementation is basically a copy of the pre-existing builtInCMapCache implementation.

For some badly generated PDF documents, it's possible that we'll end up having to fetch the same standard font data over and over, which is obviously inefficient.
While not common, it's certainly possible that a PDF document uses custom font names where the actual font then references one of the standard fonts; see e.g. issue #11399 for one such example. Edit: Loading all pages of that document currently causes the FoxitSymbol.pfb file to be loaded thirteen times.

Note that I did suggest adding worker-thread caching of standard font data in PR #12726; however, it wasn't deemed necessary at the time. Now that we have a real-world example that benefits from caching, I think that we should simply implement this now.
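
For readers unfamiliar with the pattern, here is a minimal sketch of the kind of cache described above, keyed by font filename so that repeated requests for the same file trigger only one fetch. This is not the code from this PR: the class name `StandardFontDataCache` and the `fetchFontData` callback are hypothetical, and caching the promise (rather than the resolved bytes) is an assumption made for the sake of the example.

```javascript
// Hypothetical sketch, not the actual PDF.js code: every name here
// (StandardFontDataCache, fetchFontData) is invented for illustration.
class StandardFontDataCache {
  // fetchFontData: (filename) => Promise<Uint8Array>; assumed to be the
  // expensive round-trip that the cache is meant to avoid repeating.
  constructor(fetchFontData) {
    this._fetchFontData = fetchFontData;
    this._cache = new Map(); // filename -> Promise<Uint8Array>
  }

  get(filename) {
    let promise = this._cache.get(filename);
    if (!promise) {
      // Cache the promise itself so concurrent requests share one fetch.
      promise = this._fetchFontData(filename).catch(reason => {
        // Don't keep failures around; a later request may succeed.
        this._cache.delete(filename);
        throw reason;
      });
      this._cache.set(filename, promise);
    }
    return promise;
  }

  clear() {
    this._cache.clear();
  }
}

// Usage with a stand-in fetcher that counts how often it is called.
let demoFetchCount = 0;
const demoCache = new StandardFontDataCache(async filename => {
  demoFetchCount++;
  return new Uint8Array([0x80, 0x01]); // placeholder bytes, not real font data
});

Promise.all([
  demoCache.get("FoxitSymbol.pfb"),
  demoCache.get("FoxitSymbol.pfb"),
]).then(() => {
  console.log(`fetches: ${demoFetchCount}`);
});
```

With a cache like this, a document that references the same standard font from many custom-named fonts would pay the fetch cost once per file instead of once per reference.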

@Snuffleupagus
Collaborator Author

/botio test

@pdfjsbot

pdfjsbot commented Jun 8, 2021

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 1

Live output at: http://54.67.70.0:8877/4e1c914f6aee4af/output.txt

@pdfjsbot

pdfjsbot commented Jun 8, 2021

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 1

Live output at: http://3.101.106.178:8877/9f44160cf612609/output.txt

@pdfjsbot

pdfjsbot commented Jun 8, 2021

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/4e1c914f6aee4af/output.txt

Total script time: 26.13 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.67.70.0:8877/4e1c914f6aee4af/reftest-analyzer.html#web=eq.log

@pdfjsbot

pdfjsbot commented Jun 8, 2021

From: Bot.io (Windows)


Failed

Full output at http://3.101.106.178:8877/9f44160cf612609/output.txt

Total script time: 29.22 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED

Image differences available at: http://3.101.106.178:8877/9f44160cf612609/reftest-analyzer.html#web=eq.log

@Snuffleupagus Snuffleupagus force-pushed the standardFontDataCache branch 2 times, most recently from 9faaec6 to 37c0a18 Compare June 9, 2021 16:27
@timvandermeij timvandermeij merged commit 2a7827a into mozilla:master Jun 9, 2021
@timvandermeij
Contributor

Looks good to me; thank you for doing this!

@Snuffleupagus Snuffleupagus deleted the standardFontDataCache branch June 9, 2021 19:13