wethrift/heroku-buildpack-tesseract

 
 


Heroku Buildpack Tesseract

This package provides a custom Heroku buildpack that supplies the Tesseract OCR binary and all of its required libraries to Heroku apps. Training data for English is included by default; other languages can be configured.

Configuration

The first step is to let your Heroku app use multiple buildpacks, which Heroku supports natively.

  1. Set up your app as follows:

    heroku buildpacks:add --index 1 https://github.com/cofacts/heroku-buildpack-tesseract
    heroku buildpacks:set heroku/LANG
    

    where LANG is the language used by your app (e.g., ruby, python, or nodejs). A complete list of Heroku buildpacks can be found here.

    Note: Make sure heroku/nodejs comes after heroku-buildpack-tesseract in the buildpack execution order; otherwise the npm steps that run automatically will not work.

  2. If you want Tesseract to work with languages other than English, set the environment variable TESSERACT_OCR_LANGUAGES to a comma-separated string of language codes (ISO 639-2 based, as used by Tesseract; for example, chi_tra for Traditional Chinese).

    $ heroku config:set TESSERACT_OCR_LANGUAGES="chi_tra"

  3. Push your code to Heroku.

  4. You can use the tesseract binary in your Heroku app!
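Once the app is deployed, it can be useful to confirm that the binary is on the PATH and that the languages requested in step 2 were actually installed. The following is a minimal sketch, not part of this buildpack: the function names `missing_langs` and `check_tesseract_langs` are hypothetical, and it assumes that an unset TESSERACT_OCR_LANGUAGES means English only (`eng`).

```shell
# Hypothetical helper functions; they are not shipped with this buildpack.

# Print every language from the comma-separated list in $1 that does not
# appear in the whitespace-separated list of installed languages in $2.
missing_langs() {
  # Normalise newlines and tabs to spaces and pad with spaces so that
  # whole-word matching works (eng must not match eng_vert, for example).
  installed=" $(printf '%s' "$2" | tr '\n\t' '  ') "
  for lang in $(printf '%s' "$1" | tr ',' ' '); do
    case "$installed" in
      *" $lang "*) ;;                 # language data is present
      *) printf '%s\n' "$lang" ;;     # language data is missing
    esac
  done
}

# Check that tesseract is available and that every requested language is
# installed. Assumption: an unset TESSERACT_OCR_LANGUAGES means "eng".
check_tesseract_langs() {
  if ! command -v tesseract >/dev/null 2>&1; then
    echo "tesseract not found on PATH" >&2
    return 1
  fi
  # --list-langs prints a header line followed by one language per line;
  # the header words are harmless because they never match a language code.
  available=$(tesseract --list-langs 2>&1)
  missing=$(missing_langs "${TESSERACT_OCR_LANGUAGES:-eng}" "$available")
  if [ -n "$missing" ]; then
    echo "missing language data: $missing" >&2
    return 1
  fi
  echo "all requested languages are available"
}
```

One way to use it is to open a one-off dyno with `heroku run bash`, paste or source the functions, and call `check_tesseract_langs`. If the check passes, a command such as `tesseract image.png stdout -l eng` (where image.png is a placeholder for a file in your app) should print the recognized text.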

Note

This fork upgrades the Tesseract binary from version 3.04.01 to 4.0.

License

MIT License.

Original work Copyright (c) 2013 Marco Azimonti
Modified work Copyright (c) 2015 Matteo Maggioni
Modified work Copyright (c) 2017 Oswell Chan
Modified work Copyright (c) 2018
