Skip to content

forstleblue/pythonOCRimageToText

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 

Repository files navigation

pythonOCRimageToText

This script convert image to text using tesserocr tesserocr

A simple, Pillow-friendly, wrapper around the tesseract-ocr API for Optical Character Recognition (OCR).

TravisCI build status Latest version on PyPi Supported python versions

tesserocr integrates directly with Tesseract's C++ API using Cython which allows for a simple Pythonic and easy-to-read source code. It enables real concurrent execution when used with Python's threading module by releasing the GIL while processing an image in tesseract.

tesserocr is designed to be Pillow-friendly but can also be used with image files instead.

Requirements

Requires libtesseract (>=3.04) and libleptonica (>=1.71).

On Debian/Ubuntu:

$ apt-get install tesseract-ocr libtesseract-dev libleptonica-dev You may need to manually compile tesseract for a more recent version. Note that you may need to update your LD_LIBRARY_PATH environment variable to point to the right library versions in case you have multiple tesseract/leptonica installations.

Cython is required for building and optionally Pillow to support PIL.Image objects.

Installation

$ pip install tesserocr

About

This script convert image to text using tesserocr

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages