
zabir-nabil/autoocr


autoocr

A Python wrapper for the cross-platform Tesseract OCR engine, with support for multiple languages (e.g. Bangla).

Installation

pip3 install autoocr

Usage

macOS

  • Import the library
from autoocr import AutoOCR # import the AutoOCR class
  • Specify the language
oa = AutoOCR(lang='bangla') # specify the language code
  • Set the tessdata folder. On macOS, run brew list tesseract to find the path. This is only needed once.
oa.set_datapath('/usr/local/Cellar/tesseract/4.0.0_1/share/tessdata')
  • Get the text from image by passing the path to image
out_text = oa.get_text('image_ocr.jpg')
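The tessdata path changes with the installed Tesseract version, so hard-coding it can break after an upgrade. The following is a minimal sketch, not part of autoocr, of one way to locate the folder: it checks the TESSDATA_PREFIX environment variable first, then a list of candidate directories that you supply. The helper name find_tessdata is hypothetical.

```python
import os
from pathlib import Path


def find_tessdata(candidates):
    """Return the first directory that holds at least one .traineddata file.

    Hypothetical helper (not part of autoocr). TESSDATA_PREFIX is tried
    first when it is set, then each entry of `candidates` in order.
    """
    env_dir = os.environ.get("TESSDATA_PREFIX")
    search = ([env_dir] if env_dir else []) + list(candidates)
    for entry in search:
        folder = Path(entry)
        # A usable tessdata folder contains files such as ben.traineddata.
        if folder.is_dir() and any(folder.glob("*.traineddata")):
            return str(folder)
    return None
```

If the helper returns a path, it could be passed to oa.set_datapath(...). The candidate list might include the Homebrew path shown above; any other entries are guesses about your system and should be verified with brew list tesseract.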

(Image: demo of autoocr on macOS)

Windows

  • Install the Tesseract engine

  • Import the library

from autoocr import AutoOCR # import the AutoOCR class
  • Specify the language
oa = AutoOCR(lang='bangla') # specify the language code
  • Set the tessdata folder. This is only needed once.
oa.set_datapath('/path/to/tessdata')
  • Get the text from image by passing the path to image
out_text = oa.get_text('image_ocr.jpg')
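Before setting the data path, it can help to confirm that the folder actually contains the language model you need. Tesseract stores each language as a file named after its language code with a .traineddata extension (for example, ben.traineddata for Bengali). The sketch below is a hypothetical helper, not part of autoocr, and it assumes that lang='bangla' corresponds to the ben model, which this README does not confirm.

```python
from pathlib import Path


def missing_languages(tessdata_dir, codes):
    """Return the language codes that have no <code>.traineddata file.

    Hypothetical helper (not part of autoocr). `codes` are Tesseract
    language codes such as "ben" or "eng".
    """
    folder = Path(tessdata_dir)
    return [code for code in codes
            if not (folder / f"{code}.traineddata").is_file()]
```

On Windows, pass the folder as a raw string, such as r'C:\path\to\tessdata', so that backslashes are not read as escape sequences. An empty result means every requested model file is present.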

Linux

  • Install the Tesseract engine by following the tesseract-ocr installation page

  • Import the library

from autoocr import AutoOCR # import the AutoOCR class
  • Specify the language
oa = AutoOCR(lang='bangla') # specify the language code
  • Set the tessdata folder. This is only needed once. On yum-based distributions, run rpm -ql tesseract to find the location.
oa.set_datapath('/path/to/tessdata')
  • Get the text from image by passing the path to image
out_text = oa.get_text('image_ocr.jpg')
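The steps above process one image at a time. As a minimal sketch, a small loop can apply the same get_text call to every image in a folder. The function ocr_folder and the suffix list are hypothetical and not part of autoocr. The only assumption about the ocr argument is that it exposes get_text(path), as shown above.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def ocr_folder(ocr, folder):
    """Run ocr.get_text on every image file in `folder`.

    Hypothetical helper (not part of autoocr). Returns a dict that maps
    each file name to the recognised text, in sorted file-name order.
    """
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = ocr.get_text(str(path))
    return results
```

With the oa object configured as above, ocr_folder(oa, 'scans') would return the text for each image in a folder named scans (an example name).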

License

This project is licensed under the MIT License - see the LICENSE file for details.
