training fail again and again #326

Open
Ham714 opened this issue Jan 10, 2023 · 9 comments
Labels
bug Something isn't working

Comments

@Ham714

Ham714 commented Jan 10, 2023

Please help me fix this issue. I don't know what went wrong, so please advise.

python3 shuffle.py 0 "data/OCRA/all-lstmf"

+ head -n 269 data/OCRA/all-lstmf

+ tail -n 30 data/OCRA/all-lstmf

combine_lang_model \
--input_unicharset data/OCRA/unicharset \
--script_dir data/langdata \
--numbers data/OCRA/OCRA.numbers \
--puncs data/OCRA/OCRA.punc \
--words data/OCRA/OCRA.wordlist \
--output_dir data \
\
--lang OCRA

Failed to read data from: data/OCRA/OCRA.wordlist

Failed to read data from: data/OCRA/OCRA.punc

Failed to read data from: data/OCRA/OCRA.numbers

Loaded unicharset of size 112 from file data/OCRA/unicharset

Setting unichar properties

Other case É of é is not in unicharset

Setting script properties

Failed to load script unicharset from:data/langdata/Latin.unicharset

Warning: properties incomplete for index 3 = C

Warning: properties incomplete for index 4 = H

Warning: properties incomplete for index 5 = E

Warning: properties incomplete for index 6 = S

Warning: properties incomplete for index 7 = -

Warning: properties incomplete for index 8 = R

Warning: properties incomplete for index 9 = I

Warning: properties incomplete for index 10 = K

Warning: properties incomplete for index 11 = N

Warning: properties incomplete for index 12 = G

Warning: properties incomplete for index 13 = B

Warning: properties incomplete for index 14 = 8

Warning: properties incomplete for index 15 = 5

Warning: properties incomplete for index 16 = F

Warning: properties incomplete for index 17 = ,

Warning: properties incomplete for index 18 = (

Warning: properties incomplete for index 19 = /

Warning: properties incomplete for index 20 = L

Warning: properties incomplete for index 21 = T

Warning: properties incomplete for index 22 = )

Warning: properties incomplete for index 23 = O

Warning: properties incomplete for index 24 = Y

Warning: properties incomplete for index 25 = .

Warning: properties incomplete for index 26 = D

Warning: properties incomplete for index 27 = A

Warning: properties incomplete for index 28 = M

Warning: properties incomplete for index 29 = U

Warning: properties incomplete for index 30 = P

Warning: properties incomplete for index 31 = [

Warning: properties incomplete for index 32 = ]

Warning: properties incomplete for index 33 = 9

Warning: properties incomplete for index 34 = 7

Warning: properties incomplete for index 35 = 0

Warning: properties incomplete for index 36 = 1

Warning: properties incomplete for index 37 = 4

Warning: properties incomplete for index 38 = 2

Warning: properties incomplete for index 39 = W

Warning: properties incomplete for index 40 = 3

Warning: properties incomplete for index 41 = <

Warning: properties incomplete for index 42 = >

Warning: properties incomplete for index 43 = "

Warning: properties incomplete for index 44 = V

Warning: properties incomplete for index 45 = X

Warning: properties incomplete for index 46 = '

Warning: properties incomplete for index 47 = ~

Warning: properties incomplete for index 48 = !

Warning: properties incomplete for index 49 = J

Warning: properties incomplete for index 50 = Q

Warning: properties incomplete for index 51 = Z

Warning: properties incomplete for index 52 = +

Warning: properties incomplete for index 53 = @

Warning: properties incomplete for index 54 = &

Warning: properties incomplete for index 55 = ’

Warning: properties incomplete for index 56 = =

Warning: properties incomplete for index 57 = _

Warning: properties incomplete for index 58 = €

Warning: properties incomplete for index 59 = ™

Warning: properties incomplete for index 60 = “

Warning: properties incomplete for index 61 = |

Warning: properties incomplete for index 62 = ?

Warning: properties incomplete for index 63 = :

Warning: properties incomplete for index 64 = 6

Warning: properties incomplete for index 65 = {

Warning: properties incomplete for index 66 = }

Warning: properties incomplete for index 67 = $

Warning: properties incomplete for index 68 = ;

Warning: properties incomplete for index 69 = \

Warning: properties incomplete for index 70 = —

Warning: properties incomplete for index 71 = ”

Warning: properties incomplete for index 72 = *

Warning: properties incomplete for index 73 = #

Warning: properties incomplete for index 74 = »

Warning: properties incomplete for index 75 = ®

Warning: properties incomplete for index 76 = %

Warning: properties incomplete for index 77 = £

Warning: properties incomplete for index 78 = «

Warning: properties incomplete for index 79 = °

Warning: properties incomplete for index 80 = ©

Warning: properties incomplete for index 81 = §

Warning: properties incomplete for index 82 = ¥

Warning: properties incomplete for index 83 = ¢

Warning: properties incomplete for index 84 = ‘

Warning: properties incomplete for index 85 = i

Warning: properties incomplete for index 86 = n

Warning: properties incomplete for index 87 = c

Warning: properties incomplete for index 88 = u

Warning: properties incomplete for index 89 = l

Warning: properties incomplete for index 90 = a

Warning: properties incomplete for index 91 = t

Warning: properties incomplete for index 92 = e

Warning: properties incomplete for index 93 = m

Warning: properties incomplete for index 94 = o

Warning: properties incomplete for index 95 = s

Warning: properties incomplete for index 96 = g

Warning: properties incomplete for index 97 = h

Warning: properties incomplete for index 98 = b

Warning: properties incomplete for index 99 = z

Warning: properties incomplete for index 100 = v

Warning: properties incomplete for index 101 = q

Warning: properties incomplete for index 102 = f

Warning: properties incomplete for index 103 = r

Warning: properties incomplete for index 104 = w

Warning: properties incomplete for index 105 = p

Warning: properties incomplete for index 106 = d

Warning: properties incomplete for index 107 = k

Warning: properties incomplete for index 108 = x

Warning: properties incomplete for index 109 = y

Warning: properties incomplete for index 110 = j

Warning: properties incomplete for index 111 = é

Config file is optional, continuing...

Failed to read data from: data/langdata/OCRA/OCRA.config

Failed to read data from: data/langdata/radical-stroke.txt

Error reading radical code table data/langdata/radical-stroke.txt

make: *** [Makefile:293: data/OCRA/OCRA.traineddata] Error 1

@zdenop
Contributor

zdenop commented Jan 10, 2023

  1. You ignored the instructions in https://github.com/tesseract-ocr/tesstrain/blob/main/README.md
  2. You did not provide all the steps you took, including their output (logs)

@Ham714
Author

Ham714 commented Jan 10, 2023

Actually, I followed a YouTube video about training a new font for Tesseract 5.
Sorry, I will train according to the README you linked and then post the results here.

  1. You ignored the instructions in https://github.com/tesseract-ocr/tesstrain/blob/main/README.md
  2. You did not provide all the steps you took, including their output (logs)

@zdenop
Contributor

zdenop commented Jan 10, 2023

So you should raise the issue on YouTube ;-)

@Ham714
Author

Ham714 commented Jan 11, 2023

So you should raise the issue on YouTube ;-)

Thank you so much, it worked and the training is done. I trained on just 30 lines as a test and it gives good results. I will soon train on all 193k lines with the new font.

I just want to ask one question: text2image didn't generate box files for all of the text in langdata-lstm. Why is that? It causes a problem during training.

To resolve that error, I will manually delete the corresponding .gt, .tiff and .box files.
Thank you again @zdenop
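
Instead of hunting for the affected files by hand, the incomplete sets can be listed with a short script. The following Python sketch is only an illustration: the function name `incomplete_sets`, the directory name in the usage part and the suffixes `.gt.txt`, `.tif` and `.box` are assumptions and need to be adapted to the actual file names. It only reports and does not delete anything.

```python
from collections import defaultdict
from pathlib import Path


def incomplete_sets(directory, suffixes=(".gt.txt", ".tif", ".box")):
    """Map each base name to the suffixes that are missing for it."""
    found = defaultdict(set)
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue
        for suffix in suffixes:
            if path.name.endswith(suffix):
                # Strip the suffix to get the shared base name of the set.
                found[path.name[: -len(suffix)]].add(suffix)
                break
    return {
        base: [s for s in suffixes if s not in present]
        for base, present in sorted(found.items())
        if len(present) < len(suffixes)
    }


if __name__ == "__main__":
    # "data/OCRA-ground-truth" is a hypothetical directory name.
    target = Path("data/OCRA-ground-truth")
    if target.is_dir():
        for base, missing in incomplete_sets(target).items():
            print(f"{base}: missing {', '.join(missing)}")
```

Reviewing the reported base names before removing anything keeps complete sets from being deleted by mistake.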

@stale

stale bot commented May 22, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label (Issues which require input by the reporter which is not provided) on May 22, 2023
stweil added the bug label (Something isn't working) and removed the stale label (Issues which require input by the reporter which is not provided) on Nov 2, 2023

@stweil
Collaborator

stweil commented Nov 2, 2023

The current version of tesstrain requires users to run make tesseract-langdata before running the training. Older versions of tesstrain did not require this additional step, which explains why some instructions don't mention it.

I consider this not very user-friendly requirement a bug. tesstrain should be fixed so that it gets the required langdata files automatically if they are missing for the training (as old versions did).
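
As a rough illustration of such a check, the following Python sketch reports which langdata files are absent before training starts. The two file names are taken from the error log in this issue; the function name `missing_langdata` and the default directory are assumptions made for this example and are not part of tesstrain.

```python
from pathlib import Path

# File names taken from the error log in this issue; the real list
# needed by tesstrain may be longer (assumption).
REQUIRED_LANGDATA = ("Latin.unicharset", "radical-stroke.txt")


def missing_langdata(langdata_dir="data/langdata", required=REQUIRED_LANGDATA):
    """Return the required langdata file names that do not exist yet."""
    root = Path(langdata_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_langdata()
    if missing:
        print("Missing langdata files:", ", ".join(missing))
        print("Run 'make tesseract-langdata' before starting the training.")
    else:
        print("All checked langdata files are present.")
```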

@zdenop
Contributor

zdenop commented Nov 2, 2023

I would suggest replacing the Makefile with a Python script. Main reasons:

  • training is already Python-based
  • fewer dependencies (e.g. wget, find, bash, unzip and bc)
  • better portability (it seems that make behaves differently on Windows and Linux...)
  • hopefully more contributors (some Python scripts for training have already been posted in the tesseract forum)
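
To make the dependency point concrete: the log at the top of this issue shows the training list being split with `head -n 269` and `tail -n 30`. A minimal Python sketch of the same split needs none of the external tools. The function name `split_list` and the 0.9 ratio are assumptions for illustration (the ratio is inferred from the 269/30 split in the log); this is not tesstrain's actual code.

```python
def split_list(lines, ratio_train=0.9):
    """Split lines into a training part (head) and an evaluation part (tail)."""
    total = len(lines)
    n_train = int(total * ratio_train)  # round down
    n_eval = total - n_train
    if n_train == 0:
        raise ValueError("missing ground truth for training")
    if n_eval == 0:
        raise ValueError("missing ground truth for evaluation")
    return lines[:n_train], lines[n_train:]


if __name__ == "__main__":
    # 299 dummy entries stand in for the lines of data/OCRA/all-lstmf.
    entries = [f"line-{i}.lstmf" for i in range(299)]
    train, evaluation = split_list(entries)
    print(len(train), len(evaluation))
```

With 299 entries this yields 269 training and 30 evaluation lines, the same counts as in the log.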

@stweil
Collaborator

stweil commented Nov 2, 2023

I fully agree.

@bertsky
Collaborator

bertsky commented Mar 3, 2024

The current version of tesstrain requires users to run make tesseract-langdata before running the training. Older versions of tesstrain did not require this additional step, which explains why some instructions don't mention it.

That bug has been fixed in #373.
