Missing font #107

Open
dcaud opened this issue Feb 14, 2022 · 13 comments

Comments

@dcaud

dcaud commented Feb 14, 2022

I have seen a lot of the following type of errors on various PDFs:

PDF error: Unknown font in field's DA string
PDF error: Missing 'Tf' operator in field's DA string

For example, this file, Alberta-tf-operator-error-CAV-2-FORMB.pdf, has text on the buttons on the second page (as viewed in macOS Preview or Adobe Acrobat Pro DC). However, when I convert it to PNG, that text is lost and the missing-font message is shown in the R console.

pdftools::pdf_convert("Alberta-tf-operator-error-CAV-2-FORMB.pdf",
                      page=2)

This may be a PDF file that doesn't adhere to the PDF spec, but because many PDFs do not, I'd like this to work in some fashion.

Is there any way to get pdftools to render the button text in this example file? Maybe that would point to how this can be generalized to other PDFs with similar issues.

@jeroen
Member

jeroen commented Feb 15, 2022

Hmm, I'm not sure. I don't think the buttons contain any text; I think they contain a small image. If we extract the text, it does not appear either:

cat(pdftools::pdf_text('Alberta-tf-operator-error-CAV-2-FORMB.pdf')[2])

But I am also not sure why the image does not appear in the output.

@jeroen
Member

jeroen commented Feb 15, 2022

Oh it actually seems to work with a later version of the poppler library. Maybe I should update it again.

@jeroen
Member

jeroen commented Feb 15, 2022

Which operating system do you use?

@dcaud
Author

dcaud commented Feb 15, 2022

I'm using both Mac and Linux. Here's a profile from the Mac. Thanks for looking into this!

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] tools stats graphics grDevices utils datasets methods base

other attached packages:
[1] shinyBS_0.61 jsonlite_1.7.3 mongolite_2.4.1 ipc_0.1.3
[5] future_1.23.0 promises_1.2.0.1 googleAuthR_2.0.0 firebase_1.0.1
[9] RPostgres_1.4.3 pool_0.1.6 dplyr_1.0.8 shinyjs_2.1.0
[13] pdftools_3.0.1 shinybusy_0.2.2 shinyWidgets_0.6.4 magick_2.7.3
[17] colourpicker_1.1.1 shiny_1.7.1

loaded via a namespace (and not attached):
[1] Rcpp_1.0.8 lubridate_1.8.0 txtq_0.2.4 listenv_0.8.0
[5] assertthat_0.2.1 digest_0.6.29 utf8_1.2.2 parallelly_1.30.0
[9] mime_0.12 R6_2.5.1 backports_1.4.1 httr_1.4.2
[13] pillar_1.7.0 rlang_1.0.1 curl_4.3.2 rstudioapi_0.13
[17] fontawesome_0.2.2 miniUI_0.1.1.1 jquerylib_0.1.4 blob_1.2.2
[21] qpdf_1.1 htmlwidgets_1.5.4 bit_4.0.4 jose_1.2.0
[25] compiler_4.1.2 httpuv_1.6.5 pkgconfig_2.0.3 askpass_1.1
[29] base64enc_0.1-3 globals_0.14.0 htmltools_0.5.2 openssl_1.4.6
[33] tidyselect_1.1.1 tibble_3.1.6 codetools_0.2-18 fansi_1.0.2
[37] crayon_1.4.2 withr_2.4.3 later_1.3.0 xtable_1.8-4
[41] lifecycle_1.0.1 DBI_1.1.2 magrittr_2.0.2 cli_3.1.1
[45] cachem_1.0.6 fs_1.5.2 bslib_0.3.1 filelock_1.0.2
[49] ellipsis_0.3.2 generics_0.1.2 vctrs_0.3.8 bit64_4.0.5
[53] glue_1.6.1 purrr_0.3.4 hms_1.1.1 parallel_4.1.2
[57] fastmap_1.1.0 gargle_1.2.0 base64url_1.4 memoise_2.0.1
[61] sass_0.4.0

@jeroen
Member

jeroen commented Feb 16, 2022

I have released a new version, pdftools 3.1.0, that includes a more recent version of libpoppler for Windows and macOS. You can test it from here:

install.packages("pdftools", repos = "https://ropensci.r-universe.dev")

For Linux it is a bit trickier, because we use the libpoppler that is included with your Linux distribution. I think the problem should be fixed at least in Ubuntu 22.04, which will be released in April, because it includes poppler 22.02: https://packages.ubuntu.com/jammy/libpoppler-dev

I'm not sure about the other distros; it really depends on which OS you use.
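
To see which poppler a given Linux machine would build pdftools against, a small shell check along these lines may help. It is only a sketch: `version_ge` and `wanted` are hypothetical names, it assumes `pkg-config` and a `sort` that supports `-V` (GNU coreutils), and 22.02 is simply the version mentioned above for Ubuntu 22.04, not a confirmed minimum for the fix.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: 22.02 is the version quoted in this thread for Ubuntu 22.04;
# the real minimum version that renders the buttons is not known here.
wanted="22.02"

if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists poppler-cpp; then
  have="$(pkg-config --modversion poppler-cpp)"
  if version_ge "$have" "$wanted"; then
    echo "poppler-cpp $have is at least $wanted"
  else
    echo "poppler-cpp $have is older than $wanted"
  fi
else
  echo "pkg-config could not find poppler-cpp"
fi
```

From inside R, `pdftools::poppler_config()` reports the poppler version that the installed pdftools was actually built against, which is the more direct check once the package loads.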

@krcabrer

I have the same issue, but in this case, I cannot update to pdftools version 3.1.0.

** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘pdftools’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/home/kenneth/R/x86_64-pc-linux-gnu-library/4.1/00LOCK-pdftools/00new/pdftools/libs/pdftools.so':
/home/kenneth/R/x86_64-pc-linux-gnu-library/4.1/00LOCK-pdftools/00new/pdftools/libs/pdftools.so: undefined symbol: _ZNK7poppler8text_box13has_font_infoEv
Error: loading failed
Execution halted
ERROR: loading failed

  • removing ‘/home/kenneth/R/x86_64-pc-linux-gnu-library/4.1/pdftools’
  • restoring previous ‘/home/kenneth/R/x86_64-pc-linux-gnu-library/4.1/pdftools’

The downloaded source packages are in
‘/tmp/Rtmp8BrZD7/downloaded_packages’
Warning message:
In install.packages(c("pdftools")) :
installation of package ‘pdftools’ had non-zero exit status

Any workaround?

This is my platform:

R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
[1] LC_CTYPE=es_CO.UTF-8 LC_NUMERIC=C
[3] LC_TIME=es_CO.UTF-8 LC_COLLATE=es_CO.UTF-8
[5] LC_MONETARY=es_CO.UTF-8 LC_MESSAGES=es_CO.UTF-8
[7] LC_PAPER=es_CO.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=es_CO.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_4.1.2

Thank you very much for your help.

Kenneth

@jeroen
Member

jeroen commented Feb 22, 2022

@krcabrer it works for me on Ubuntu 20.04. Can you please show the full output of your installation log? You probably have multiple, conflicting versions of poppler installed on your machine.

@krcabrer

Dear @jeroen: The complete log of the procedure follows. I also uninstalled and purged the poppler libraries and then installed them again, so there should be only one version, but the issue persists.

  • installing source package ‘pdftools’ ...
    ** package ‘pdftools’ successfully unpacked and MD5 sums checked
    ** using staged installation
    Found pkg-config cflags and libs!
    Using PKG_CFLAGS=-I/usr/local/include/poppler/cpp -I/usr/local/include/poppler
    Using PKG_LIBS=-L/usr/local/lib -lpoppler-cpp
    ** libs
    g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I/usr/local/include/poppler/cpp -I/usr/local/include/poppler -I'/home/kenneth/R/x86_64-pc-linux-gnu-library/4.1/Rcpp/include' -fvisibility=hidden -fpic -g -O2 -fdebug-prefix-map=/build/r-base-i2PIHO/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c RcppExports.cpp -o RcppExports.o
    g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I/usr/local/include/poppler/cpp -I/usr/local/include/poppler -I'/home/kenneth/R/x86_64-pc-linux-gnu-library/4.1/Rcpp/include' -fvisibility=hidden -fpic -g -O2 -fdebug-prefix-map=/build/r-base-i2PIHO/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c bindings.cpp -o bindings.o
    g++ -std=gnu++11 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o pdftools.so RcppExports.o bindings.o -L/usr/local/lib -lpoppler-cpp -L/usr/lib/R/lib -lR
    installing to /home/kenneth/R/x86_64-pc-linux-gnu-library/4.1/00LOCK-pdftools/00new/pdftools/libs
    ** R
    ** byte-compile and prepare package for lazy loading
    ** help
    *** installing help indices
    ** building package indices
    ** testing if installed package can be loaded from temporary location
    Error: package or namespace load failed for ‘pdftools’ in dyn.load(file, DLLpath = DLLpath, ...):
    unable to load shared object '/home/kenneth/R/x86_64-pc-linux-gnu-library/4.1/00LOCK-pdftools/00new/pdftools/libs/pdftools.so':
    /home/kenneth/R/x86_64-pc-linux-gnu-library/4.1/00LOCK-pdftools/00new/pdftools/libs/pdftools.so: undefined symbol: _ZNK7poppler8text_box13has_font_infoEv
    Error: loading failed
    Execution halted
    ERROR: loading failed
  • removing ‘/home/kenneth/R/x86_64-pc-linux-gnu-library/4.1/pdftools’
  • restoring previous ‘/home/kenneth/R/x86_64-pc-linux-gnu-library/4.1/pdftools’
    Warning in install.packages :
    installation of package ‘pdftools’ had non-zero exit status
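
In the log above, pkg-config resolves poppler-cpp to `/usr/local`, and the undefined symbol demangles to `poppler::text_box::has_font_info() const`. One possible reading (an assumption, not confirmed in this thread) is that the package compiled against headers under `/usr/local` while a different `libpoppler-cpp` was picked up at load time. The sketch below looks for that situation; `uses_usr_local` is a hypothetical helper name, and the script assumes `pkg-config` and `ldconfig` are available.

```shell
#!/bin/sh
# uses_usr_local FLAGS: succeeds when a pkg-config flags string
# points into /usr/local (for example -I/usr/local/include/poppler).
uses_usr_local() {
  case "$1" in
    */usr/local/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Show which flags a source build of pdftools would pick up.
if command -v pkg-config >/dev/null 2>&1; then
  flags="$(pkg-config --cflags --libs poppler-cpp 2>/dev/null || true)"
  echo "poppler-cpp flags: ${flags:-<not found>}"
  if uses_usr_local "$flags"; then
    echo "poppler-cpp resolves to /usr/local; another copy may be loaded at run time"
  fi
fi

# List every libpoppler-cpp the dynamic loader knows about.
if command -v ldconfig >/dev/null 2>&1; then
  ldconfig -p 2>/dev/null | grep poppler-cpp || true
fi
```

If more than one `libpoppler-cpp` shows up, or the flags point to `/usr/local` while the loader lists only a copy under `/usr/lib`, that would be consistent with the conflicting-versions explanation.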

Thank you for your help.

Kenneth

@krcabrer

Dear @jeroen, I found the solution. I used this PPA repository for poppler:

sudo add-apt-repository ppa:bzamecnik/poppler

Then I updated, and now the package compiles fine.

It seems the problem was the default version of poppler installed on the system.

Greetings from Medellín, Colombia, South America.

Kenneth

@dcaud
Author

dcaud commented Feb 27, 2022

Thanks for releasing pdftools 3.1.0, which seems likely to fix the issue I posted on Mac and Windows.

However, I'd like to use this on Linux. Waiting until April and then upgrading to the newer version of Linux will be quite difficult for me. I'm several Linux distro releases behind 22.

If that's the way to go, I'll try when that happens. If there is any way to make this fix in pdftools not depend on the Linux version, that'd be great, but ultimately this isn't a dealbreaker for me. Thanks!

@jeroen
Member

jeroen commented Feb 27, 2022

We could create a PPA with a newer version of poppler. What distro are you using?

@dcaud
Author

dcaud commented Feb 27, 2022

Updated.

Hi Jeroen. Thanks for looking into this. I'm using this distro:

Distributor ID: Debian
Description: Debian GNU/Linux 11 (bullseye)
Release: 11
Codename: bullseye

I imagine that a PPA isn't really a long-term solution. If I wait until April, should the fix you suggested earlier work?

@dcaud
Author

dcaud commented Apr 25, 2022

Hello again. I updated pdftools on Mac and the PDF mentioned in the first post of this thread now renders as expected on my Mac.

However, it doesn't render as expected on shinyapps.io. Any idea how to make it work there? @jeroen mentioned above that updating poppler may be tricky on Ubuntu (which I think is what shinyapps.io uses).
