Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with using other languages on OS X with tesseract installed with brew #23

Open
p7r opened this issue Oct 24, 2013 · 7 comments
Open

Comments

@p7r
Copy link

p7r commented Oct 24, 2013

When trying to do tesseract.rb -l ara or when setting up an Engine as follows:

tesseract = Tesseract::Engine.new{|e| 
# Note this fails for multiple values of e.path and for no value at all
    e.path = "/usr/local/Cellar/tesseract/3.02.02/share/"
    e.language = :ara 
  }

I'm getting this:

Failed loading language 'ara'
Tesseract couldn't load any languages!
/usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/api.rb:104:in `init': the API did not Init correctly (RuntimeError)
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/engine.rb:234:in `_init'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/engine.rb:54:in `initialize'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/call-me-0.0.2.3/lib/call-me/named.rb:207:in `block in named'
    from ./img2txt.rb:14:in `new'
    from ./img2txt.rb:14:in `<main>'

Tesseract itself is installed correctly and using the compiled binary that comes in the package, I am able to load Arabic language files and get OCR output.

Any suggestions gratefully received.

@p7r
Copy link
Author

p7r commented Oct 25, 2013

Further to this I removed all tesseract libraries on my machine and reinstalled them and the tesseract-ocr gem.

It seems with English it's fine, but it can't find the language files:

$ tesseract.rb 
/usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/engine.rb:248:in `_setup': you have to set an image first (ArgumentError)
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/engine.rb:149:in `text_for'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/call-me-0.0.2.3/lib/call-me/named.rb:207:in `block in named'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/bin/tesseract.rb:77:in `block in <top (required)>'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/bin/tesseract.rb:57:in `tap'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/bin/tesseract.rb:57:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/tesseract.rb:19:in `load'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/tesseract.rb:19:in `<main>'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/ruby_noexec_wrapper:14:in `eval'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/ruby_noexec_wrapper:14:in `<main>'

$ tesseract.rb --help
Usage: tesseract [options]
        --path PATH                  datapath to set
    -l, --language LANGUAGE          language to use
    -m, --mode MODE                  mode to use
    -p, --psm MODE                   page segmentation mode to use
    -u, --unlv                       output in UNLV format
    -c, --confidence                 output the mean confidence of the recognition
    -C, --config PATH...             config files to load
    -b, --blacklist LIST             blacklist the following chars
    -w, --whitelist LIST             whitelist the following chars
    -s, --scale VALUE                scale the image before analyzing it
    -r, --resize VALUE               resize the image before analyzing it

$ tesseract.rb -l ara image.png
Failed loading language 'ara'
Tesseract couldn't load any languages!
/usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/api.rb:104:in `init': the API did not Init correctly (RuntimeError)
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/engine.rb:234:in `_init'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/engine.rb:54:in `initialize'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/call-me-0.0.2.3/lib/call-me/named.rb:207:in `block in named'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/bin/tesseract.rb:57:in `new'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/bin/tesseract.rb:57:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/tesseract.rb:19:in `load'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/tesseract.rb:19:in `<main>'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/ruby_noexec_wrapper:14:in `eval'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/ruby_noexec_wrapper:14:in `<main>'

$ tesseract.rb -l ara --path /usr/local/Cellar/tesseract/3.02.02/share image.png
Failed loading language 'ara'
Tesseract couldn't load any languages!
/usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/api.rb:104:in `init': the API did not Init correctly (RuntimeError)
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/engine.rb:234:in `_init'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/engine.rb:54:in `initialize'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/call-me-0.0.2.3/lib/call-me/named.rb:207:in `block in named'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/bin/tesseract.rb:57:in `new'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/bin/tesseract.rb:57:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/tesseract.rb:19:in `load'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/tesseract.rb:19:in `<main>'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/ruby_noexec_wrapper:14:in `eval'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/ruby_noexec_wrapper:14:in `<main>'

$ tesseract.rb -l ara --path /usr/local/Cellar/tesseract/3.02.02/share/tessdata image.png
Failed loading language 'ara'
Tesseract couldn't load any languages!
/usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/api.rb:104:in `init': the API did not Init correctly (RuntimeError)
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/engine.rb:234:in `_init'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/lib/tesseract/engine.rb:54:in `initialize'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/call-me-0.0.2.3/lib/call-me/named.rb:207:in `block in named'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/bin/tesseract.rb:57:in `new'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/gems/tesseract-ocr-0.1.5/bin/tesseract.rb:57:in `<top (required)>'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/tesseract.rb:19:in `load'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/tesseract.rb:19:in `<main>'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/ruby_noexec_wrapper:14:in `eval'
    from /usr/local/rvm/gems/ruby-1.9.2-p290/bin/ruby_noexec_wrapper:14:in `<main>'
Pauls-Mac-mini:arabicocrtest paul$ tesseract.rb -l eng --path /usr/local/Cellar/tesseract/3.02.02/share/tessdata image.png
V L: _ i _ if __ r
., - 7-; f"::"'=:,  ’
‘HQ.’ .9 9 " x_. ‘
' .' ”- « >3)’   »
'5--4 war; -11-!  2.! u-r‘J:“fi-&“‘->s’9":‘;’,,‘,’ .4» ma

The garbage output is expected the only text in that image is Arabic.

@meh
Copy link
Owner

meh commented Oct 25, 2013

I'll have a look very soon (likely toward the end of the weekend).

@meh
Copy link
Owner

meh commented Nov 24, 2013

I'm very sorry I haven't looked into this yet, I've been very busy but I promise I will as soon as I have time.

@juniorjp
Copy link

I have the same problem trying the Nerdz example in this repo. The :lol language is not loaded.

@meh
Copy link
Owner

meh commented Feb 24, 2014

I think this is an OS X specific issue, and I don't have such a machine to fix this problem.

@juniorjp
Copy link

In my case my problem is using Ubuntu. If I change the :lol language( in the Nerdz example) to default :en everything works fine.

And it's the same error "the API did not Init correctly (RuntimeError)"

@meh
Copy link
Owner

meh commented Feb 24, 2014

@juniorjp1989 oh, that's almost good to know then, guess it's a problem with non Arch Linux systems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants