A Jupyter notebook that runs OCR on images (using Tesseract OCR) and saves the recognized text to each image's IPTC Caption.
Jupyter Notebook or alternative
Tesseract OCR - don't forget to update the path to the Tesseract executable in the notebook.
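As a rough sketch of what updating the Tesseract path can look like: pytesseract reads the executable location from `pytesseract.pytesseract.tesseract_cmd`. The helper names `find_tesseract` and `configure_pytesseract` below are hypothetical and not taken from the notebook; the fallback to searching `PATH` is also an assumption.

```python
import shutil


def find_tesseract(explicit_path=None):
    """Return the Tesseract executable path, or None if it cannot be found.

    An explicitly given path wins; otherwise the system PATH is searched.
    """
    if explicit_path:
        return explicit_path
    return shutil.which("tesseract")


def configure_pytesseract(explicit_path=None):
    """Point pytesseract at the Tesseract executable and return the path used."""
    import pytesseract  # imported here so find_tesseract works without it

    cmd = find_tesseract(explicit_path)
    if cmd is None:
        raise FileNotFoundError(
            "Tesseract executable not found; pass its full path explicitly."
        )
    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

On Windows, for example, you might call `configure_pytesseract(r"C:\Program Files\Tesseract-OCR\tesseract.exe")`, adjusting the path to wherever Tesseract is installed on your machine.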
You may need to install missing Python libraries, notably:
- pytesseract
- iptcinfo3
- tqdm
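To see which of the libraries listed above are missing before running the notebook, a small check like the following may help. This is only a sketch; the function name `missing_libraries` is hypothetical.

```python
import importlib.util

# Libraries named in this README.
REQUIRED = ["pytesseract", "iptcinfo3", "tqdm"]


def missing_libraries(names=REQUIRED):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are installed.")
```

Anything it reports can usually be installed with pip, for example `pip install pytesseract iptcinfo3 tqdm`.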
- Put the images you want to process in the ./src folder
- Run the notebook
- Review saved images in ./output
You can use ./samples for some test images.
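For orientation, the steps above can be sketched roughly as follows. This is not the notebook's actual code: the helper names (`plan_outputs`, `clean_caption`, `process_images`) are hypothetical, restricting input to JPEG files is an assumption, and trimming to 2000 bytes reflects the IPTC Caption/Abstract length limit.

```python
from pathlib import Path

# Assumption: only JPEG files are processed.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def plan_outputs(src_dir="./src", out_dir="./output"):
    """Pair each JPEG in src_dir with the path it will be saved to in out_dir."""
    src, out = Path(src_dir), Path(out_dir)
    return [
        (path, out / path.name)
        for path in sorted(src.iterdir())
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    ]


def clean_caption(text, max_bytes=2000):
    """Collapse whitespace and trim the text to at most max_bytes of UTF-8."""
    caption = " ".join(text.split())
    # Cut on bytes, then drop any partial trailing character.
    return caption.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")


def process_images(src_dir="./src", out_dir="./output"):
    """OCR every image in src_dir and save a captioned copy to out_dir."""
    # Third-party imports stay local so the helpers above work without them.
    import pytesseract
    from iptcinfo3 import IPTCInfo
    from tqdm import tqdm

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for src_path, out_path in tqdm(plan_outputs(src_dir, out_dir)):
        # pytesseract accepts a file path as well as an image object.
        text = pytesseract.image_to_string(str(src_path))
        # force=True lets iptcinfo3 open files that have no IPTC data yet.
        info = IPTCInfo(str(src_path), force=True)
        # Depending on the iptcinfo3 version, this value may need to be bytes.
        info["caption/abstract"] = clean_caption(text)
        info.save_as(str(out_path))
```

Calling `process_images("./samples", "./output")` would run the same steps on the sample images. Writing to a separate output folder leaves the originals in ./src untouched.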