Difference in output generated by gem and tesseract command line #40

Open
Meenal-goyal opened this issue Jul 11, 2014 · 8 comments
Comments

@Meenal-goyal

I was trying to extract text from an image using the tesseract command line, but since I wanted to use a Ruby script I tried your gem. The problem is that the gem gives different output, and in some cases its output is noticeably worse.
Is there a version difference?
Additional info:

$ tesseract -v
tesseract 3.02.02
leptonica-1.69
libjpeg 8d : libpng 1.6.12 : zlib 1.2.5

What version is gem using?

@meh
Owner

meh commented Jul 11, 2014

The gem uses the version installed on the system.

@Meenal-goyal
Author

Then what is the reason for the different output? Is it possible that the gem uses an older version of tesseract installed on the system instead of the new one?
I have only the latest version on my system, but maybe it has support for older versions as well.

@meh
Owner

meh commented Jul 11, 2014

No, that's not how it works. The only possible reason is different default options between the binary and the library.

@Meenal-goyal
Author

So, how can I change these options for the binary? Also, I want to set extra configuration variables such as matcher_good_threshold. What option should I pass in the Ruby script?

@cwulfman

Was there ever an answer for this question? I'm having the same problem. This may not be the right place to ask, but how can I see the default configuration being used by the binary so I can pass that configuration into the gem?

@meh
Owner

meh commented May 11, 2015

I honestly don't know. Someone would have to dig through the binary's source code to figure out which default options differ.

@cwulfman

Ok; thank you.


@amitdo

amitdo commented Feb 18, 2016

@meh,
FYI, the default page segmentation mode (psm) for the tesseract command line is '3', while for libtesseract it's '6'.
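
One way to check whether that default explains the difference is to run the binary with the library's mode and compare the result with the gem's output. The following is only a sketch under a few assumptions: the helper names `tesseract_command` and `ocr_with_binary` are made up for illustration, the constant names mirror tesseract's page segmentation enum, and the flag is spelled `-psm` as in the 3.x command line (newer releases may expect `--psm`, so the flag is a parameter). It shells out to the binary and does not use the gem's API.

```ruby
require "open3"

# Page segmentation modes mentioned in the comment above.
PSM_AUTO         = 3 # default of the command line binary
PSM_SINGLE_BLOCK = 6 # default of libtesseract

# Build the argument list for the tesseract binary (hypothetical helper).
# `flag` is "-psm" for 3.x; pass "--psm" if your version expects that spelling.
def tesseract_command(image, output_base, psm: PSM_SINGLE_BLOCK, flag: "-psm")
  ["tesseract", image, output_base, flag, psm.to_s]
end

# Run the binary and return the recognized text (hypothetical helper).
# tesseract writes its result to "<output_base>.txt".
def ocr_with_binary(image, output_base, psm: PSM_SINGLE_BLOCK, flag: "-psm")
  log, status = Open3.capture2e(*tesseract_command(image, output_base, psm: psm, flag: flag))
  raise "tesseract failed: #{log}" unless status.success?
  File.read("#{output_base}.txt")
end
```

For example, `ocr_with_binary("page.png", "out")` (both file names are placeholders) runs the binary with mode 6; if that text matches what the gem returns, the page segmentation default is the likely cause, and calling it with `psm: PSM_AUTO` should reproduce the usual command line output.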
