Cuneiform: Split text areas before OCR #2

jflesch · 2012-04-25T20:29:00Z

Cuneiform tends to stop reading a page when it reaches a large non-readable area. Because of this, when using Cuneiform, not all of the keywords are actually extracted.

A way to work around this problem would be to split the text areas prior to OCR.

For instance, unpaper can do that (ocrfeeder uses it).
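
To illustrate the idea, here is a minimal sketch of one common way to split a page into text areas: a horizontal projection profile, which cuts the page wherever enough consecutive rows contain no ink. This is not unpaper's algorithm and nothing like it exists in PyOCR; the function name `split_text_bands`, the `min_gap` parameter, and the 0/1 pixel-grid input format are all assumptions made for the example.

```python
def split_text_bands(rows, min_gap=2):
    """Return (top, bottom) row ranges, bottom exclusive, of bands with ink.

    rows: list of pixel rows, each a list of 0 (background) or 1 (ink).
    min_gap: number of consecutive blank rows required to separate two bands
             (hypothetical tuning parameter).
    """
    bands = []
    start = None     # first row of the band currently being built
    last_ink = None  # most recent row that contained ink
    for y, row in enumerate(rows):
        if not any(row):
            continue  # blank row: nothing to do until ink shows up again
        if start is None:
            start = y
        elif y - last_ink > min_gap:
            # At least min_gap blank rows since the last ink: close the band.
            bands.append((start, last_ink + 1))
            start = y
        last_ink = y
    if start is not None:
        bands.append((start, last_ink + 1))
    return bands


if __name__ == "__main__":
    page = [
        [0, 0, 0],
        [1, 1, 0],
        [0, 1, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [1, 0, 1],
        [0, 0, 0],
    ]
    print(split_text_bands(page, min_gap=2))
```

Each returned range could then be cropped out of the page (for example with Pillow's `Image.crop`) and passed to the OCR tool separately, so that one unreadable area does not stop the rest of the page from being read. A real implementation would also need to binarize the scan first and probably split on columns as well.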

jflesch · 2016-10-26T15:40:19Z

Note: Image processing algorithms should be added to https://github.com/jflesch/libpillowfight/, not to PyOCR directly.

jflesch added the "feature request" and "to study" labels on Apr 26, 2017
