This repository has been archived by the owner on Jun 14, 2018. It is now read-only.

Confidence score #58

Open
MathieuCliche opened this issue Mar 24, 2017 · 6 comments

Comments

@MathieuCliche

Is it possible to get a confidence score for the predictions (not the orientation)?

@jflesch
Member

jflesch commented Mar 25, 2017

You mean one confidence score for the OCR on the whole image?
I'm not even sure whether Tesseract provides such a score.

@MathieuCliche
Author

Yeah, for the whole image, or per word. From what I read, it's possible to get it from the hOCR or TSV output. You can check it out here: https://github.com/tesseract-ocr/tesseract/wiki/Command-Line-Usage#tsv-output-currently-available-in-305-dev-in-master-branch-on-github

For example, the TSV output has a column "conf", which gives the confidence for each word.
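
To illustrate the "conf" column, here is a minimal sketch of reading per-word confidences from Tesseract's TSV output. It assumes the usual TSV header (`level`, `page_num`, ..., `conf`, `text`) and that rows for pages, blocks, paragraphs and lines carry a confidence of -1. The helper name `word_confidences` and the sample data are made up for this example and are not part of PyOCR or Tesseract.

```python
import csv
import io


def word_confidences(tsv_text):
    """Return (text, conf) pairs for the word rows of Tesseract TSV output."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        conf = float(row["conf"])
        # Structural rows (page, block, paragraph, line) have conf == -1
        # and no text, so they are skipped here.
        if conf < 0 or not text:
            continue
        words.append((text, conf))
    return words


# Hand-written sample in the TSV layout (illustrative values only).
HEADER = ["level", "page_num", "block_num", "par_num", "line_num",
          "word_num", "left", "top", "width", "height", "conf", "text"]
ROWS = [
    ["1", "1", "0", "0", "0", "0", "0", "0", "640", "480", "-1", ""],
    ["5", "1", "1", "1", "1", "1", "36", "92", "60", "20", "96", "Hello"],
    ["5", "1", "1", "1", "1", "2", "104", "92", "64", "20", "71", "wor1d"],
]
SAMPLE_TSV = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS)

print(word_confidences(SAMPLE_TSV))
```

With real input, the TSV text would come from running Tesseract with its `tsv` config and reading the resulting file.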

@jflesch
Member

jflesch commented Mar 25, 2017

Ok, good to know.
For the words, I guess it can be added as an attribute to pyocr.builders.Box objects.
Regarding the whole page, with the current API, it's going to be a little more complicated ...
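
Until a whole-page score is exposed, one possible workaround is to aggregate the per-word confidences yourself. The sketch below is only an assumption about what a useful page-level number could look like: it shows a plain mean and a mean weighted by word length. Neither is claimed to match whatever Tesseract computes internally, and both function names are hypothetical.

```python
def page_confidence(word_confs):
    """Unweighted mean of per-word confidences, or None if there are no words."""
    # Negative values mark "no confidence available" and are ignored.
    confs = [c for c in word_confs if c >= 0]
    if not confs:
        return None
    return sum(confs) / len(confs)


def weighted_page_confidence(words):
    """Mean confidence weighted by word length, so long words count more.

    `words` is an iterable of (text, conf) pairs.
    """
    pairs = [(len(text), conf) for text, conf in words if conf >= 0 and text]
    total_chars = sum(length for length, _ in pairs)
    if total_chars == 0:
        return None
    return sum(length * conf for length, conf in pairs) / total_chars


print(page_confidence([96, 71, -1]))
print(weighted_page_confidence([("Hello", 96), ("a", 60)]))
```

The weighted variant keeps a single misread short token from dragging the page score down as much as a misread long word would.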

@jflesch
Member

jflesch commented Nov 30, 2017

For per-word confidence, you can thank @a-pagano: #86 :-)

@jflesch jflesch closed this as completed Nov 30, 2017
@jflesch
Member

jflesch commented Nov 30, 2017

Sorry, I meant to keep this ticket open for the whole-page confidence score.

@jflesch jflesch reopened this Nov 30, 2017
@jflesch
Member

jflesch commented Dec 14, 2017

@a-pagano's changes have been released in PyOCR 0.5.
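
As a rough sketch of how the per-word score might be used: the snippet below assumes that each word box exposes the recognized text as `content` and the score as `confidence`. The attribute name `confidence` is an assumption here and should be checked against the PyOCR 0.5 source. `FakeBox` is a hypothetical stand-in so the example runs without Tesseract installed, and `low_confidence_words` is an illustrative helper, not a PyOCR function.

```python
from collections import namedtuple

# Hypothetical stand-in for pyocr.builders.Box, reduced to the two
# attributes this sketch relies on (names assumed, see the note above).
FakeBox = namedtuple("FakeBox", ["content", "confidence"])


def low_confidence_words(boxes, threshold=60):
    """Return the text of every box whose confidence is below `threshold`."""
    return [box.content for box in boxes if box.confidence < threshold]


# With PyOCR, the boxes would typically come from something like:
#   tool = pyocr.get_available_tools()[0]
#   boxes = tool.image_to_string(
#       image, lang="eng", builder=pyocr.builders.WordBoxBuilder())
boxes = [FakeBox("Invoice", 95), FakeBox("t0tal", 42), FakeBox("2017", 88)]

print(low_confidence_words(boxes))
```

Flagging words below a threshold like this is one way to decide which parts of a page need manual review.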
