Unable to convert .tif (multipages .tif file) to pdf using pdfsandwich on Ubuntu #52

Open
DEEPAK-KESWANI opened this issue Aug 1, 2018 · 3 comments

@DEEPAK-KESWANI
DEEPAK-KESWANI commented Aug 1, 2018

Hi,

I tried the conversion both through the OCR action and by running the same command directly in an Ubuntu terminal, with no luck either way.

In both cases I get the error below when converting a multi-page .tif file to .pdf.

Can you please help with this?

$ /usr/bin/pdfsandwich -verbose -lang spa+eng+fra Sample_3_Multi_page.tif -o Sample_3_Multi_page.pdf
pdfsandwich version 0.1.4
Checking for convert:
convert -version
Version: ImageMagick 6.8.9-9 Q16 x86_64 2018-07-10 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2014 ImageMagick Studio LLC
Features: DPC Modules OpenMP
Delegates: bzlib cairo djvu fftw fontconfig freetype jbig jng jpeg lcms lqr ltdl lzma openexr pangocairo png rsvg tiff wmf x xml zlib

Checking for unpaper:
unpaper -version
6.1
Checking for tesseract:
tesseract -v
tesseract 3.04.01
leptonica-1.73
libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.4 : libopenjp2 2.1.0

Checking for gs:
gs -v
GPL Ghostscript 9.18 (2015-10-05)
Copyright (C) 2015 Artifex Software, Inc. All rights reserved.
Input file: "Sample_3_Multi_page.tif"
Output file: "Sample_3_Multi_page.pdf"
Fatal error: exception Failure("Error: Could not determine number of pages of file Sample_3_Multi_page.tif")

Thanks.

@angelborroy-ks
Contributor

Try using OCRmyPDF instead; it works better with TIFF files.

@DEEPAK-KESWANI
Author

Hi Angel,

I have installed OCRmyPDF, and it works perfectly from the command line, but when it is run through the add-on it throws the error below:

What could be the issue?

Caused by: java.lang.RuntimeException: org.alfresco.service.cmr.repository.ContentIOException: 07020044 Failed to perform OCR transformation:
Execution result:
os: Linux
command: /usr/local/bin/ocrmypdf --verbose 1 --force-ocr -l eng /opt/alfresco523/tomcat/temp/Alfresco/OCRTransformWorker_source_7224955351801332287.tiff /opt/alfresco523/tomcat/temp/Alfresco/OCRTransformWorker_source_7224955351801332287_ocr.pdf
succeeded: false
exit code: 1
out:
err: Traceback (most recent call last):
  File "/usr/local/bin/ocrmypdf", line 7, in <module>
    from ocrmypdf.__main__ import run_pipeline
  File "/usr/local/lib/python3.5/dist-packages/ocrmypdf/__init__.py", line 35, in <module>
    from . import hocrt
at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:86)
at es.keensoft.alfresco.ocr.OCRExtractAction.executeImplInternal(OCRExtractAction.java:181)
... 10 more
Caused by: org.alfresco.service.cmr.repository.ContentIOException: 07020044 Failed to perform OCR transformation:
Execution result:
os: Linux
command: /usr/local/bin/ocrmypdf --verbose 1 --force-ocr -l eng /opt/alfresco523/tomcat/temp/Alfresco/OCRTransformWorker_source_7224955351801332287.tiff /opt/alfresco523/tomcat/temp/Alfresco/OCRTransformWorker_source_7224955351801332287_ocr.pdf
succeeded: false
exit code: 1
out:
err: Traceback (most recent call last):
  File "/usr/local/bin/ocrmypdf", line 7, in <module>
    from ocrmypdf.__main__ import run_pipeline
  File "/usr/local/lib/python3.5/dist-packages/ocrmypdf/__init__.py", line 35, in <module>
    from . import hocrt
at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:79)
... 11 more

Thanks

@angelborroy-ks
Contributor

If you installed Alfresco with the Alfresco installer, you should probably create a shell script that isolates the execution environment of the OCR command.

Some samples are available in the FAQ.
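
For illustration, a wrapper along the following lines is one way to isolate the environment. This is only a sketch, not the script from the FAQ. It assumes the failure comes from library or Python settings (such as `LD_LIBRARY_PATH`) that the installer's Tomcat passes on to child processes, which this thread does not confirm. The names `WRAPPER` and `OCRMYPDF_BIN`, the wrapper file name, and the `PATH` value are hypothetical choices made for the example.

```shell
#!/bin/sh
# Sketch: generate a wrapper that runs ocrmypdf outside the environment
# inherited from the Alfresco-bundled Tomcat.
# WRAPPER and OCRMYPDF_BIN are hypothetical names used only in this example.
WRAPPER="${WRAPPER:-./ocrmypdf-wrapper.sh}"

cat > "$WRAPPER" <<'EOF'
#!/bin/sh
# Assumption: the installer's Tomcat exports library and Python settings
# that conflict with the system Python used by ocrmypdf, so drop them.
unset LD_LIBRARY_PATH PYTHONHOME PYTHONPATH

# Assumption: a conventional system PATH is enough to find tesseract, gs, etc.
PATH=/usr/local/bin:/usr/bin:/bin
export PATH

# Hand over to the real binary, passing every argument through unchanged.
exec "${OCRMYPDF_BIN:-/usr/local/bin/ocrmypdf}" "$@"
EOF

# Make the generated wrapper executable.
chmod +x "$WRAPPER"
```

The add-on's OCR command setting would then point at the wrapper rather than at `/usr/local/bin/ocrmypdf`, so the arguments seen in the trace above reach `ocrmypdf` unchanged while the inherited variables do not.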
