pypdfocr-docker

PyPDFOCR on Docker

Get rid of your paperwork...

What is PyPDFOCR

PyPDFOCR converts a scanned PDF into an OCR'ed PDF using Tesseract-OCR and Ghostscript.

Dockerfile

Trusted Build

This Docker image is based on the official Ubuntu base image.

It incorporates a patch for issue #41 of pypdfocr 0.9.0, which is likely to be fixed in 0.9.1.

How to use this image

docker run --rm mmatiaschek/pypdfocr [-h] [-d] [-v] [-m] [-l LANG] [--preprocess]
                [--skip-preprocess] [-w WATCH_DIR] [-f] [-c CONFIGFILE] [-e]
                [-n]
                [pdf_filename]

Case 1: Single Document

docker run -v ~/:/media --rm mmatiaschek/pypdfocr /media/filename.pdf

This reads filename.pdf from your home directory and generates filename_ocr.pdf.
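
The single-document command can be wrapped in a small loop to process a whole directory. The following is a minimal sketch, not part of this image: the helper names (`ocr_output_name`, `pending_pdfs`, `ocr_pending`) are hypothetical, and it assumes that filename_ocr.pdf is written next to filename.pdf in the mounted directory.

```shell
# Sketch: OCR every PDF in a directory that has no *_ocr.pdf counterpart yet.
# Assumption: the container writes <name>_ocr.pdf next to <name>.pdf.

IMAGE="mmatiaschek/pypdfocr"

# Print the expected OCR output path for an input PDF path.
ocr_output_name() {
  printf '%s_ocr.pdf\n' "${1%.pdf}"
}

# List PDFs in directory $1 that are not OCR outputs themselves
# and do not have an OCR output yet.
pending_pdfs() {
  local dir="$1" pdf
  for pdf in "$dir"/*.pdf; do
    [ -e "$pdf" ] || continue                 # the glob matched nothing
    case "$pdf" in *_ocr.pdf) continue ;; esac
    [ -e "$(ocr_output_name "$pdf")" ] || printf '%s\n' "$pdf"
  done
}

# Run the container once per pending PDF, mounting the directory at /media.
ocr_pending() {
  local dir="$1" pdf
  pending_pdfs "$dir" | while IFS= read -r pdf; do
    docker run -v "$dir":/media --rm "$IMAGE" "/media/$(basename "$pdf")"
  done
}
```

After sourcing these functions, `ocr_pending "$HOME/Documents/Paper"` would process the directory. Pass an absolute path, since Docker treats a relative `-v` source as a volume name rather than a host directory.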

Case 2: Watch folder

docker run -v ~/Documents/Paper:/media --rm mmatiaschek/pypdfocr -w /media -f -c /media/config.yaml

For a sample config, see config.yaml or the PyPDFOCR author's repository.
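
To keep the watch folder running in the background, the same command can be started detached. This is a sketch under assumptions: the container name `pypdfocr-watch` and the helpers `run` and `start_watch` are made up for illustration, and running the watcher detached has not been verified against this image. `--rm` is dropped because Docker does not allow it together with `--restart`.

```shell
# Sketch: start the watch-folder container detached, with a restart policy.
# Set DRY_RUN=1 to print the docker command instead of executing it.

run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1 is the absolute host path of the folder to watch; it must contain config.yaml.
start_watch() {
  run docker run -d --restart unless-stopped --name pypdfocr-watch \
    -v "$1":/media mmatiaschek/pypdfocr -w /media -f -c /media/config.yaml
}
```

With the functions sourced, `DRY_RUN=1 start_watch "$HOME/Documents/Paper"` prints the command for review, and `docker logs -f pypdfocr-watch` follows the output once the container is running.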

Help

docker run --rm mmatiaschek/pypdfocr [-h] [-d] [-v] [-m] [-l LANG] [--preprocess]
                [--skip-preprocess] [-w WATCH_DIR] [-f] [-c CONFIGFILE] [-e]
                [-n]
                [pdf_filename]

Interactive Shell

docker run --entrypoint=/bin/bash -t -i mmatiaschek/pypdfocr

How I use this image

I use Scanner Pro on iOS (Scanbot on Android) to scan documents and upload them, without OCR, to a WebDAV folder.
The WebDAV folder is hosted on my Synology DiskStation NAS via HTTPS and shared between devices with CloudStation.
I run PyPDFOCR on Docker either manually on Mac OS X or hosted on a local server.

This way, my personal documents never have to leave my own hardware or network (my personal cloud).
