ocr-to-iptc

A Jupyter notebook that runs OCR on images with Tesseract OCR, then saves the recognized text to each image's IPTC caption.

Prerequisites

Jupyter Notebook or alternative

Tesseract OCR: remember to update the path to the Tesseract executable in the notebook before running it.
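
One possible way to set the Tesseract path is sketched below. It is not taken from the notebook: the helper name find_tesseract and the Windows fallback path are assumptions for illustration, so adjust them to your installation. pytesseract reads the executable location from pytesseract.pytesseract.tesseract_cmd.

```python
import shutil
from pathlib import Path


def find_tesseract(candidates=()):
    """Return a path to the Tesseract executable, or None if none is found.

    Hypothetical helper: checks the system PATH first, then any fallback
    locations passed in by the caller.
    """
    found = shutil.which("tesseract")
    if found:
        return found
    for candidate in candidates:
        if Path(candidate).is_file():
            return str(candidate)
    return None


# Assumed fallback location for a typical Windows install; change as needed.
tesseract_path = find_tesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])

if tesseract_path is None:
    print("Tesseract not found; set the path manually in the notebook.")
else:
    try:
        import pytesseract

        # Tell pytesseract which executable to call.
        pytesseract.pytesseract.tesseract_cmd = tesseract_path
    except ImportError:
        print("pytesseract is not installed yet.")
```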

Getting Started

You may need to install missing Python libraries, notably:

pytesseract

iptcinfo3

tqdm
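
To see which of the libraries listed above are missing before running the notebook, a small check along these lines may help. The function name missing_modules is made up for this sketch, and the suggested pip command assumes the package names match the import names.

```python
import importlib.util

# Libraries named in this README.
REQUIRED = ["pytesseract", "iptcinfo3", "tqdm"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing libraries; try: pip install " + " ".join(missing))
else:
    print("All listed libraries are installed.")
```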

Put the images you want to process in the ./src folder.
Run the notebook.
Review the saved images in ./output.
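
For orientation, the steps above could be sketched roughly as follows. This is not the notebook's actual code: the function names, the restriction to .jpg/.jpeg files, and the use of the "caption/abstract" field are assumptions made for this example. The library imports are done inside the functions so the sketch can be loaded even when the libraries are not installed yet.

```python
from pathlib import Path

# Assumption for this sketch: only JPEG files are processed.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def list_images(src_dir):
    """Return the image files found directly inside src_dir."""
    src = Path(src_dir)
    if not src.is_dir():
        return []
    return sorted(
        p for p in src.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def ocr_to_caption(image_path, output_dir):
    """OCR one image and save a copy with the text as its IPTC caption."""
    # Imported lazily so the helpers above work without these libraries.
    import pytesseract
    from PIL import Image
    from iptcinfo3 import IPTCInfo

    with Image.open(image_path) as img:
        text = pytesseract.image_to_string(img).strip()

    # force=True lets IPTCInfo open files that have no IPTC data yet.
    info = IPTCInfo(str(image_path), force=True)
    info["caption/abstract"] = text

    out_path = Path(output_dir) / Path(image_path).name
    info.save_as(str(out_path))
    return out_path


def run(src_dir="./src", output_dir="./output"):
    """Process every image in src_dir and return the saved paths."""
    images = list_images(src_dir)
    if not images:
        return []
    from tqdm import tqdm  # progress bar over the images

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    return [ocr_to_caption(p, output_dir) for p in tqdm(images)]
```

Calling run() from a notebook cell would process everything in ./src and write the captioned copies to ./output, leaving the originals untouched.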

Miscellaneous

You can use ./samples for some test images.

andrewthong/ocr-to-iptc

Folders and files

Latest commit

History

Repository files navigation

ocr-to-iptc

Prerequisites

Getting Started

Miscellaneous

About

Resources

License

Stars

Watchers

Forks

Languages