OCR language

By default, Paperwork uses Tesseract for OCR. If Tesseract is unavailable, it falls back to Cuneiform.

To get better results, OCR tools need to know the language used in the document(s).

The languages available in Paperwork's settings dialog are those understood by the automatically selected OCR tool (Tesseract or Cuneiform). If your language is not in the list, the OCR tool does not have the data required to read it. For Tesseract, installing the matching language package for your distribution, as shown in the sections below, provides that data.
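To check which languages Tesseract can already read before opening the settings dialog, one option is the `tesseract --list-langs` command. The sketch below wraps it in a small helper; the function name `list_ocr_langs` is made up for this example, and it assumes the first line of the output is a header, which may vary between Tesseract versions.

```shell
# Hypothetical helper (not part of Paperwork): print the language codes
# for which the installed Tesseract has data, one per line.
list_ocr_langs() {
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "tesseract not found" >&2
        return 1
    fi
    # Some versions write the list to stderr, so merge both streams.
    # Assumption: the first line is a header such as
    # "List of available languages (N):", so it is skipped.
    tesseract --list-langs 2>&1 | tail -n +2
}
```

Running `list_ocr_langs` in a terminal should print codes such as `eng` or `fra`. If the code for your language is missing, the language package is probably not installed.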

Debian

# OCR (Tesseract)
$ sudo apt-get install tesseract-ocr tesseract-ocr-<lang>

Fedora

# OCR (Tesseract)
$ sudo yum install tesseract tesseract-langpack-<lang>

Ubuntu

# OCR (Tesseract)
$ sudo apt-get install tesseract-ocr tesseract-ocr-<lang>
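As an illustration of how the `<lang>` placeholder in the commands above is filled in, the sketch below builds the package list for a given distribution and language. It assumes `<lang>` is the three-letter code Tesseract uses (for example `fra` for French or `deu` for German). The helper `ocr_packages` is hypothetical, and exact package names can differ for some languages, so it is worth checking with your package manager's search first.

```shell
# Hypothetical helper: print the Tesseract packages to install for a
# distribution and a language code, following the commands above.
ocr_packages() {
    distro="$1"   # "debian", "ubuntu" or "fedora"
    lang="$2"     # assumed three-letter code, e.g. "fra"
    case "$distro" in
        debian|ubuntu) echo "tesseract-ocr tesseract-ocr-$lang" ;;
        fedora)        echo "tesseract tesseract-langpack-$lang" ;;
        *)             echo "unsupported distribution: $distro" >&2
                       return 1 ;;
    esac
}

ocr_packages debian fra   # prints: tesseract-ocr tesseract-ocr-fra
ocr_packages fedora deu   # prints: tesseract tesseract-langpack-deu
```

The printed names can then be passed to `apt-get install` or `yum install` as in the sections above.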
