andrewthong/ocr-to-iptc

ocr-to-iptc

A Jupyter notebook that runs OCR on images with Tesseract OCR, then saves the recognized text to each image's IPTC Caption field.

Prerequisites

Jupyter Notebook or alternative

Tesseract OCR - don't forget to update the path to the Tesseract executable in the notebook.
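As a rough illustration of the path update, the sketch below sets the executable location through `pytesseract.pytesseract.tesseract_cmd`. The helper names `resolve_tesseract` and `configure_pytesseract` are hypothetical and not taken from the notebook, and falling back to a `PATH` lookup is an assumption about what is convenient, not a description of what the notebook does.

```python
import shutil


def resolve_tesseract(explicit_path=None):
    """Return explicit_path if given, else the `tesseract` found on PATH (or None)."""
    return explicit_path or shutil.which("tesseract")


def configure_pytesseract(explicit_path=None):
    """Point pytesseract at the Tesseract executable and return the path used."""
    import pytesseract  # imported lazily so resolve_tesseract works without it

    cmd = resolve_tesseract(explicit_path)
    if cmd is None:
        raise FileNotFoundError(
            "Tesseract executable not found; pass its full path explicitly"
        )
    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

Calling `configure_pytesseract()` with no argument relies on `tesseract` being on `PATH`. Otherwise, pass the full path of your local installation.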

Getting Started

You may need to install missing Python libraries. The notable ones are:

  • pytesseract
  • iptcinfo3
  • tqdm

  1. Put images you want to process in the ./src folder
  2. Run the notebook
  3. Review saved images in ./output
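For orientation, the three steps above could look roughly like the sketch below. It is not the notebook's actual code. The function names `find_images` and `ocr_to_caption` are hypothetical, and the restriction to JPEG files is an assumption. The sketch also assumes Pillow is installed alongside pytesseract and that the Tesseract executable is already configured or on `PATH`.

```python
from pathlib import Path

# Assumption: only JPEG files are processed.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def find_images(src_dir):
    """Return the image files directly inside src_dir, sorted by path."""
    return sorted(
        p for p in Path(src_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def ocr_to_caption(src_dir="./src", out_dir="./output"):
    """OCR every image in src_dir and save a captioned copy in out_dir."""
    # Third-party imports are kept local so find_images works without them.
    import pytesseract
    from PIL import Image
    from iptcinfo3 import IPTCInfo
    from tqdm import tqdm

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    for path in tqdm(find_images(src_dir)):
        # Run Tesseract on the image and trim surrounding whitespace.
        with Image.open(path) as img:
            text = pytesseract.image_to_string(img).strip()

        # force=True creates an empty IPTC record if the file has none.
        info = IPTCInfo(str(path), force=True)
        info["caption/abstract"] = text
        info.save_as(str(out / path.name))
```

Running `ocr_to_caption()` from the repository root would read from ./src and write to ./output, leaving the source files untouched.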

Miscellaneous

You can use ./samples for some test images.
