Incorrect/outdated documentation in README.md #316

pratheesh-prakash · 2022-10-05T05:10:35Z

In general, the documentation in README.md is vague: it does not explain the training parameters or their impact on the output model.

Beyond that, some of the information in README.md is incorrect or outdated. Here are the major issues I have noticed.

Line 126 of README.md says

FINETUNE_TYPE Finetune Training Type - Impact, Plus, Layer or blank. Default: ''

However, the Makefile doesn't seem to make use of this parameter anywhere. The help text (available through make help) also omits this line. Is that because the option is no longer available in later versions, or because the Makefile is outdated? Additionally, there is no information whatsoever on how these values (i.e. Impact, Plus, Layer or '') influence the training.

For plotting CER, according to README.md, the user must run './plot/plot_cer.sh'. Unfortunately, no such shell script exists in 'plot'. Additionally, the Python scripts provided in 'plot' work only if the log file is first parsed to produce a CSV.
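To illustrate the log-to-CSV step mentioned above, here is a minimal Python sketch. It rests on assumptions not confirmed in this thread: that the training log contains lines of the form "At iteration a/b/c, ... BCER train=X%, ..." (or "char train=X%" in older Tesseract releases), and that the second of the three counters is the one worth plotting. The function names and the CSV column names are hypothetical and would need adapting to whatever the scripts in 'plot' actually expect.

```python
import csv
import io
import re

# Assumed shape of a training log line (an assumption, not taken from the README):
#   At iteration 100/100/100, Mean rms=1.5%, delta=4.2%, BCER train=12.3%, ...
# Older releases are assumed to print "char train=" instead of "BCER train=".
ITERATION_RE = re.compile(
    r"At iteration (?P<first>\d+)/(?P<second>\d+)/(?P<third>\d+),"
    r".*?(?:BCER|char) train=(?P<cer>\d+(?:\.\d+)?)%"
)


def log_to_rows(lines):
    """Return (iteration, train CER in percent) for every matching log line."""
    rows = []
    for line in lines:
        match = ITERATION_RE.search(line)
        if match:
            # The second counter is used here; that choice is an assumption.
            rows.append((int(match.group("second")), float(match.group("cer"))))
    return rows


def write_csv(rows, out):
    """Write the rows to a file-like object, with hypothetical column names."""
    writer = csv.writer(out)
    writer.writerow(["iteration", "train_cer_percent"])
    writer.writerows(rows)


if __name__ == "__main__":
    # Made-up sample lines, only for demonstrating the parser.
    sample_log = [
        "At iteration 100/100/100, Mean rms=1.5%, delta=4.2%, BCER train=12.3%, BWER train=30.1%, skip ratio=0%, wrote checkpoint.",
        "some unrelated log line",
    ]
    buffer = io.StringIO()
    write_csv(log_to_rows(sample_log), buffer)
    print(buffer.getvalue())
```

To use it on a real log, one would pass an open file object to log_to_rows and an output file opened with newline='' to write_csv. Lines that do not match the assumed pattern are skipped silently.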

The documentation also does not explain how to interpret the results, how to optimise the hyperparameters, or how to improve the training data (for example, how to prevent 'Compute CTC targets failed' errors).

It would be great if README.md were updated with the latest information and gave a clearer, more detailed explanation of the various parameters.


stale · 2022-11-13T00:06:52Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stweil · 2022-11-15T09:59:39Z

@pratheesh-prakash, do you want to send a pull request which improves that documentation?

pratheesh-prakash · 2022-11-15T10:19:26Z

@stweil: I really wish I could contribute to tesseract-ocr, but I do not have in-depth knowledge of the issues I have raised. I consulted the documentation precisely to clarify those doubts, and found the information either missing or outdated. I would suggest that the update be done by one of the developers.

zdenop · 2023-02-20T17:37:47Z

Some details and an explanation of what happened are in #257.

stale bot added the stale label (Issues which require input by the reporter which is not provided) on Nov 13, 2022

stale bot removed the stale label (Issues which require input by the reporter which is not provided) on Nov 15, 2022

stweil added the enhancement New feature or request label Nov 15, 2022

zdenop mentioned this issue Feb 20, 2023

FINETUNE_TYPE gone? #333

Closed
