QA Spec #225

kba · 2022-09-13T09:29:35Z

This pull request offers our first draft for the QA Specs. It consists of two main parts:

- the metrics definitions, provided in ocrd_eval.md (identical to https://pad.gwdg.de/rLDBVhmYQ8CwOd67KcYHwQ#)
- ~~the schema for the file format we want to use, e.g. for the benchmarking, ocrd_eval.sample.yml~~ cf. QA Spec - Schema #236

ocrd_eval.schema.json

ocrd_eval.schema.yml

ocrd_tool.schema.json

mweidling · 2022-09-26T10:59:26Z

LGTM!

ocrd_eval.schema.json

ocrd_eval.schema.yml

Co-authored-by: mweidling <13831557+mweidling@users.noreply.github.com>

kba

I've merged all your proposals AFAICT.

For the future please modify only the YAML files, the JSON files are generated from them. I can also generate the JSON in a non-pretty-printed format to reduce confusion.
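To illustrate the difference between the pretty-printed and the non-pretty-printed JSON mentioned above, here is a minimal sketch using only the Python standard library. The schema fragment and the function names are hypothetical and not taken from ocrd_eval.schema.yml; loading the YAML source itself would additionally need a YAML parser, which is left out here.

```python
import json

# Hypothetical schema fragment standing in for the parsed YAML source.
SCHEMA = {
    "type": "object",
    "properties": {"cer": {"type": "number"}},
}


def to_minimal_json(schema: dict) -> str:
    """Serialize without indentation or extra spaces (non-pretty-printed)."""
    return json.dumps(schema, separators=(",", ":"))


def to_pretty_json(schema: dict) -> str:
    """Serialize with indentation, which is easier to read and to edit by hand."""
    return json.dumps(schema, indent=2)


if __name__ == "__main__":
    print(to_minimal_json(SCHEMA))
    print(to_pretty_json(SCHEMA))
```

Both forms parse back to the same data. The compact form mainly signals that the file is a generated artifact rather than something to edit.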

ocrd_eval.schema.yml

mweidling · 2022-09-27T11:26:42Z

> I've merged all your proposals AFAICT.
>
> For the future please modify only the YAML files, the JSON files are generated from them. I can also generate the JSON in a non-pretty-printed format to reduce confusion.

Yeah, I noticed that too – too late. 🤪 Will do in the future!

M3ssman · 2022-10-07T09:55:44Z

Just to make it as clear as possible: for the purposes of these definitions, is a character a glyph, i.e. something printable and visible, a graphical representation of a character? If so, does that mean that any special whitespace code point (space, tab, zero-width space, invisible times, ...) is not a character as far as OCR-D QA is concerned?

IMHO this is quite reasonable.

This doesn't apply to word-based metrics. But since structured GT is usually supposed to be the backbone of evaluation, word boundaries (or the words themselves) are already present in the data, provided it is available at least at word level.

Since this implies stripping all spaces beforehand for character-based text evaluation, it should be clarified at which level (line with spaces, or finer) both the GT and the corresponding candidate data are available.

If GT is, for whatever reason, only available at line level, I assume that these spaces have been normalized too, or even inserted by some legacy tooling, so there is no reason to keep these code points in that case either.
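The reading proposed in this comment can be sketched as follows: drop whitespace and invisible code points from both GT and candidate text, then compute a character error rate on what remains. This is only an illustration under stated assumptions, not the definition from ocrd_eval.md: treating every Unicode format character (category `Cf`) as "not a character" is an assumption, the CER used is the plain Levenshtein distance divided by the GT length, and all function names are hypothetical.

```python
import unicodedata


def strip_invisible(text: str) -> str:
    """Drop whitespace and format code points (assumption: category Cf is invisible)."""
    return "".join(
        ch for ch in text
        if not ch.isspace() and unicodedata.category(ch) != "Cf"
    )


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit costs for insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def cer_without_spaces(gt: str, ocr: str) -> float:
    """Character error rate on visible characters only."""
    gt_chars = strip_invisible(gt)
    ocr_chars = strip_invisible(ocr)
    if not gt_chars:
        # Assumption: an empty GT gives 0 for an empty candidate, infinity otherwise.
        return 0.0 if not ocr_chars else float("inf")
    return levenshtein(gt_chars, ocr_chars) / len(gt_chars)


if __name__ == "__main__":
    print(cer_without_spaces("ab cd", "abcd"))
    print(cer_without_spaces("abcd", "abxd"))
```

With this normalization, a candidate that differs from the GT only in spacing scores a CER of 0, which is exactly why the level at which spaces exist in the data matters.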

ocrd_eval.sample.yml

paulpestov · 2022-12-12T20:12:36Z

Due to ongoing changes in the ocrd_eval schema I think we should omit those changes for this PR to separate the different issues that this PR is trying to solve: defining a first draft of metrics definitions and creating a JSON schema for the Quiver API. Since the API is still at a very early stage I'm not quite sure if it generates that much value for us if we create a spec right now.

mweidling · 2022-12-16T07:26:18Z

> Due to ongoing changes in the ocrd_eval schema I think we should omit those changes for this PR to separate the different issues that this PR is trying to solve: defining a first draft of metrics definitions and creating a JSON schema for the Quiver API. Since the API is still at a very early stage I'm not quite sure if it generates that much value for us if we create a spec right now.

I second that. Since the requirements for the UI are not completely clear yet, we should move the JSON schema for the data to be delivered by the back end to a separate branch.

kba · 2022-12-19T12:38:10Z

> I second that. Since the requirements for the UI are not completely clear yet, we should move the JSON schema for the data to be delivered by the back end to a separate branch.

Fine with me.

kba · 2022-12-19T12:55:57Z

Schema changes now in #236

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

ocrd_eval.md

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

ocrd_eval.md

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

bertsky

devil in the details...

ocrd_eval.md

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

bertsky

I can live with the PR as it is now.

If you want to elaborate on the definition of the BoW metrics, fine. If BWE should be replaced with BoW-Precision and BoW-Recall, also fine, as long as you give a concrete definition (ideally also mentioning which existing implementations provide which precise measure).
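To make the request for a concrete definition tangible, here is a minimal sketch of one common multiset formulation of bag-of-words precision and recall. It is an assumption for illustration only: it is neither the definition adopted in ocrd_eval.md nor the measure of any particular implementation, and the whitespace tokenization and the function name `bow_scores` are hypothetical choices.

```python
from collections import Counter


def bow_scores(gt_text: str, ocr_text: str) -> tuple[float, float]:
    """Return (precision, recall) over bags of whitespace-separated words."""
    gt_bag = Counter(gt_text.split())
    ocr_bag = Counter(ocr_text.split())

    # Multiset intersection: each word counts as often as it occurs in both bags.
    overlap = sum((gt_bag & ocr_bag).values())
    gt_total = sum(gt_bag.values())
    ocr_total = sum(ocr_bag.values())

    # Assumption: an empty bag yields a score of 0.0 rather than an error.
    precision = overlap / ocr_total if ocr_total else 0.0
    recall = overlap / gt_total if gt_total else 0.0
    return precision, recall


if __name__ == "__main__":
    print(bow_scores("the cat sat on the mat", "the cat sat on mat mat"))
    print(bow_scores("a b c d", "a b"))
```

Because word order is ignored, such scores are insensitive to reading-order errors. Whether that is desirable is one of the points a precise definition would have to settle.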

More references you might want to include:

- Leifert & Labahn 2019: End-to-End Measure for Text Recognition (on CER and derived metrics, analysis of reading order, segmentation and geometry influences)
- Zhang et al. 2021: Rethinking Semantic Segmentation Evaluation for Explainability and Model Selection (semantic segmentation metrics like IoU discussed specifically with regard to over-segmentation and under-segmentation, proposes new metrics too)
- Rice 1996: Measuring the Accuracy of Page-Reading Systems (on distance algorithms, rates, derived metrics)
- Kanai & Rice 1995: Automated Evaluation of OCR Zoning (on layout evaluation, with metrics like Move Counting)
- Alberti et al. 2017: Open Evaluation Tool for Layout Analysis of Document Images (basic metrics around layout evaluation)
- Clausner et al. 2011: Scenario Driven In-Depth Performance Evaluation of Document Layout Analysis Methods (original PRImA Layout Performance Score and discussion)

ocrd_eval.md

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

kba · 2023-03-01T12:06:12Z

Merged and will release it later. There are still open questions and "postponed" metrics but it is an excellent first version we can and will build upon.

kba · 2023-03-01T12:07:33Z

If I missed an unresolved discussion or some aspect that should be tracked in a dedicated issue, please let me know and/or open an issue.

bertsky · 2023-03-01T12:39:00Z

The only open discussion is the one about BoW metrics. It's still somewhat valuable because it shows which implementations use which definitions. (Or should we add this to our Evaluation Wiki page?)

kba added 5 commits September 6, 2022 17:44

wip

e2d4038

Merge remote-tracking branch 'origin/master' into qa-spec

0cc4a0b

rewrite eval schema and saple according to OCR-D/zenhub#123

5aa6bd5

add metrics to ocrd_eval.md

6cd0caf

ocrd_eval: \begin{array}{ll} instead of .. {2}

b529531

kba marked this pull request as ready for review September 26, 2022 07:39

mweidling added 2 commits September 26, 2022 11:36

style(ocrd_eval.md): linting, formatting and correcting images

18333b8

stlye: add new line

fe9d6ff

mweidling reviewed Sep 26, 2022


mweidling mentioned this pull request Sep 27, 2022

Metrics definition as JSON OCR-D/zenhub#132

Closed

mweidling reviewed Sep 27, 2022

ocrd_eval.schema.json (outdated, resolved)

ocrd_eval.schema.yml (outdated, resolved)

kba and others added 4 commits September 27, 2022 11:59

Apply suggestions from code review

d7854a1

Co-authored-by: mweidling <13831557+mweidling@users.noreply.github.com>

Apply suggestions from code review

ee67881

Co-authored-by: mweidling <13831557+mweidling@users.noreply.github.com>

retcon JSON changes to YAML

5b35358

comment EvaluationMetrics back in

1aa048c

kba commented Sep 27, 2022

ocrd_eval.schema.yml (outdated, resolved)

ocrd_eval.schema.yml (outdated, resolved)

kba and others added 2 commits September 27, 2022 12:12

generate minimal JSON from YAML src

5840476

comment out undiscussed CER metrics

c9d313f

paulpestov requested changes Nov 23, 2022

ocrd_eval.sample.yml (outdated, resolved)

feat: move workflow_steps to ocr_workflow object

a814c89

kba added a commit that referenced this pull request Dec 19, 2022

remove markdown doc (see #225)

11965f5

kba mentioned this pull request Dec 19, 2022

QA Spec - Schema #236

Closed

remove schema from this branch, cf. #236

a881e08

Apply suggestions from code review

9ea4b62

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

bertsky reviewed Feb 10, 2023

ocrd_eval.md (outdated, resolved)

ocrd_eval.md (outdated, resolved)

ocrd_eval.md (outdated, resolved)

ocrd_eval.md (resolved)

mweidling and others added 6 commits February 14, 2023 10:22

add bow metric

c910f0e

format document

8c22169

gpu mem instead of util

149a2eb

Update ocrd_eval.md

2999ef4

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

GPU Peak Memory definition

87f9438

Update ocrd_eval.md

5e80c94

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

bertsky reviewed Feb 14, 2023

ocrd_eval.md (outdated, resolved)

ocrd_eval.md (outdated, resolved)

ocrd_eval.md (outdated, resolved)

ocrd_eval.md (resolved)

ocrd_eval.md (resolved)

mweidling and others added 7 commits February 15, 2023 08:11

Update ocrd_eval.md

5cd5efb

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

Update ocrd_eval.md

492b6ee

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

Update ocrd_eval.md

d8d4cef

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

add letter accuracy

48e69f8

rephrase layout eval intro

3cc5bee

add reading order evaluation

f817521

implement Uwe's feedback reg. Letter Accuracy

04c5c27

bertsky reviewed Feb 15, 2023

ocrd_eval.md (outdated, resolved)

ocrd_eval.md (outdated, resolved)

ocrd_eval.md (outdated, resolved)

mweidling mentioned this pull request Feb 16, 2023

QA spec: reading order evaluation #238

Open

Apply suggestions from code review

d078b1b

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

bertsky approved these changes Feb 16, 2023

ocrd_eval.md (outdated, resolved)

ocrd_eval.md (outdated, resolved)

ocrd_eval.md (outdated, resolved)

This was referenced Feb 28, 2023

GT-level specific metrics #239

Open

Additional literature for QA Spec #240

Open

eval: Improvements to TeX formulas

43b364a

Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>

kba mentioned this pull request Mar 1, 2023

Add new metrics BoC and BoW qurator-spk/dinglehopper#60

Closed

kba merged commit a3505d7 into master Mar 1, 2023

kba deleted the qa-spec branch March 1, 2023 12:04

bertsky mentioned this pull request Mar 21, 2023

BoW metric implementation ulb-sachsen-anhalt/digital-eval#10

Open
