Tesseract 3.03 and PDF #2

Open
scruss opened this issue Jun 30, 2014 · 2 comments


scruss commented Jun 30, 2014

Not so much an issue as a note: recent versions of Tesseract can recognize text and output the result in a PDF overlay:

tesseract infile.tif outfile pdf

This saves mucking about with hocr or any other intermediate text format.
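For a directory of scans, the same command can be wrapped in a small helper. This is only a sketch: `pdf_base` and `ocr_to_pdf` are hypothetical names, and it assumes a `tesseract` binary with PDF output on the `PATH`. Tesseract adds the `.pdf` extension to the output base itself, so the helper strips the image extension first.

```shell
#!/bin/sh
# Hypothetical helpers (not part of tesseract) around the command above.

pdf_base() {
    # Strip the directory and the final extension: scans/page1.tif -> page1
    name=$(basename "$1")
    printf '%s\n' "${name%.*}"
}

ocr_to_pdf() {
    # Runs: tesseract <image> <base> pdf, which writes <base>.pdf
    # into the current directory.
    tesseract "$1" "$(pdf_base "$1")" pdf
}

# Usage (uncomment to OCR every TIFF in the current directory):
# for f in *.tif; do ocr_to_pdf "$f"; done
```

Keeping the name handling in its own function means it can be checked without running OCR at all.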

Contributor

dcloud commented Jul 2, 2014

Interesting. I see something about this in release notes but don't see any other reference to this. Will have to build from trunk and check this out. Thanks!

Author

scruss commented Jul 3, 2014

Hi Daniel,

> Interesting. I see something about this in release notes
> https://code.google.com/p/tesseract-ocr/wiki/ReleaseNotes but don't
> see any other reference to this. Will have to build from trunk and check
> this out. Thanks!

Current Ubuntu version is 3.03, should you wish to avoid a rebuild.
Homebrew on OS X, annoyingly, is still at 3.02.

cheers,
Stewart
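Since packaged versions differ (3.03 on Ubuntu, 3.02 on Homebrew at the time of writing), a script could check the installed version before asking for PDF output. A rough sketch, assuming 3.03 is the first release with the `pdf` output mentioned above; `version_at_least` is a hypothetical helper, not a tesseract feature.

```shell
#!/bin/sh
# Succeed if version $1 is >= version $2, comparing dot-separated fields
# numerically (so 3.03 counts as newer than 3.02.02).
version_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        n = split(have, h, "."); m = split(want, w, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            a = (i <= n) ? h[i] + 0 : 0   # missing fields count as 0
            b = (i <= m) ? w[i] + 0 : 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0                            # all fields equal
    }'
}

# Usage, assuming the first line of `tesseract --version` looks like
# "tesseract 3.03" (older releases print it on stderr, hence 2>&1):
# installed=$(tesseract --version 2>&1 | awk 'NR==1 {print $2}')
# version_at_least "$installed" 3.03 || echo "PDF output probably unavailable"
```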
