NAR decoder error #96

Open
Gavinic opened this issue Jun 7, 2023 · 3 comments

Comments

Gavinic commented Jun 7, 2023

@baudm Hi, thanks for your great work.
I have trained the PARSeq model on my own dataset; the max length is 89 and the chardict is 95full.yaml. When I test a text line with the NAR decoder, the result looks weird: it contains a lot of repeated characters, as shown below.
Image: (attached image)
GT: 11 SORSOGON ST LEVITOWN, CITY OF PARANAQUE, NCR
PRE: 11 SORSOGON ST LEVITOWN, CITY OF PARANAQUE,,NNRR,,,ITTTOOOWNNN,, CCCIITTTYY O OFFFF R VVV
When I use the AR decoder, however, there are no repeated characters. Is there any way to alleviate this problem? Thanks.

Owner

baudm commented Jun 9, 2023

Off the top of my head, I can only hypothesize that the repeated characters are caused by NAR decoding failing to recognize the end of the sequence. In short, [E] is not being decoded properly, which is why you're seeing more characters than expected.
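
A minimal sketch of this hypothesis, using plain Python lists rather than anything from the PARSeq codebase. The helper `truncate_at_eos` and the toy token lists are illustrative assumptions. Because NAR decoding predicts every position in one pass, the output length depends only on where [E] lands:

```python
EOS = "[E]"  # end-of-sequence symbol, as named in the comment above


def truncate_at_eos(tokens):
    """Return the text before the first EOS; keep everything if EOS never appears."""
    out = []
    for tok in tokens:
        if tok == EOS:
            break
        out.append(tok)
    return "".join(out)


# Toy per-position predictions (made up for illustration).
good = list("NCR") + [EOS] + list("TOWN")  # [E] predicted at the right position
bad = list("NCR") + [","] + list("TOWN")   # [E] missed: trailing positions leak through

print(truncate_at_eos(good))  # NCR
print(truncate_at_eos(bad))   # NCR,TOWN
```

In the second case nothing marks where the text ends, so whatever the model emitted for the unused positions shows up in the output.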

The current configuration (128x32 px images with 8x4 px patch size) is not expected to perform well for such a wide and short (in terms of height) text instance.
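
A rough back-of-the-envelope calculation shows the scale of the problem. The image and patch sizes come from the comment above and the max length from the original report; the assumption that the text fills the full image width with evenly spaced characters is mine:

```python
img_w, img_h = 128, 32    # image size in px (from the comment above)
patch_w, patch_h = 8, 4   # patch size in px (from the comment above)
max_len = 89              # max label length (from the original report)

patch_cols = img_w // patch_w        # patches along the width
patch_rows = img_h // patch_h        # patches along the height
num_patches = patch_cols * patch_rows

# Assumption: an 89-character line spans the whole 128 px width evenly.
chars_per_patch_col = max_len / patch_cols
px_per_char = img_w / max_len

print(patch_cols, patch_rows, num_patches)  # 16 8 128
print(round(chars_per_patch_col, 2))        # 5.56
print(round(px_per_char, 2))                # 1.44
```

Under that assumption each character gets less than 2 px of width after resizing, and each 8 px wide patch column has to cover more than five characters.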

One thing you could try is to modify AR decoding to decode more than 1 character at a time. This is easily doable with the current codebase I think. You could increase the max length to 90, then decode 10 characters at once to minimize AR iterations.
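
The control flow of that idea could look roughly like the sketch below. It is not PARSeq code: `predict_chunk` is a hypothetical callback that stands in for one decoder pass, and the toy model simply reads from a known string so the loop can be run on its own.

```python
import math

EOS = "[E]"


def decode_in_chunks(predict_chunk, max_len, chunk_size):
    """Greedy AR decoding that emits up to `chunk_size` tokens per iteration.

    `predict_chunk(prefix, k)` is a hypothetical callback: given the tokens
    decoded so far, it returns predictions for the next k positions.
    """
    prefix, steps = [], 0
    while len(prefix) < max_len:
        k = min(chunk_size, max_len - len(prefix))
        chunk = predict_chunk(prefix, k)
        steps += 1
        if EOS in chunk:
            # Keep only the tokens before the end-of-sequence symbol and stop.
            prefix.extend(chunk[:chunk.index(EOS)])
            break
        prefix.extend(chunk)
    return "".join(prefix), steps


def make_toy_model(text):
    """Stand-in for a real decoder: always predicts the known text, then EOS."""
    target = list(text) + [EOS]

    def predict_chunk(prefix, k):
        chunk = target[len(prefix):len(prefix) + k]
        return chunk + [EOS] * (k - len(chunk))  # pad short chunks with EOS

    return predict_chunk


gt = "11 SORSOGON ST LEVITOWN, CITY OF PARANAQUE, NCR"
text, steps = decode_in_chunks(make_toy_model(gt), max_len=90, chunk_size=10)
_, steps_single = decode_in_chunks(make_toy_model(gt), max_len=90, chunk_size=1)
print(text == gt, steps, steps_single)
```

With a max length of 90 and 10 characters per pass, the loop runs at most 9 times, compared with up to 90 passes when decoding one character at a time. In a real model, the characters inside a chunk are predicted without seeing each other, so this trades some of the AR decoder's context for speed.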

Owner

baudm commented Jun 9, 2023

I also notice that you're trying to decode whitespace characters. That could also be a potential cause of the issue with NAR decoding. Another thing you could try is increasing the number of refinement iterations + ANDing the cloze mask with a mask which excludes low-confidence characters.
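
A minimal sketch of the mask combination, written with plain Python lists instead of tensors. The function names, the confidence values, and the 0.5 threshold are illustrative assumptions, not part of the PARSeq codebase. Here `True` means "position j is visible as context to position i":

```python
def cloze_mask(n):
    """Each position may use every other position as context, but not itself."""
    return [[i != j for j in range(n)] for i in range(n)]


def confidence_mask(confidences, threshold):
    """Hide positions whose predicted confidence is below the threshold."""
    n = len(confidences)
    return [[confidences[j] >= threshold for j in range(n)] for _ in range(n)]


def and_masks(a, b):
    """Element-wise AND of two boolean matrices of the same shape."""
    return [[x and y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Made-up per-character confidences from a first decoding pass.
confidences = [0.99, 0.97, 0.35, 0.92]
visible = and_masks(cloze_mask(4), confidence_mask(confidences, threshold=0.5))

for row in visible:
    print(row)
```

The low-confidence third character is hidden from every other position, so a later refinement pass would not condition on it. Some attention APIs use the opposite boolean convention, where `True` means "blocked"; in that case the masks would be inverted and combined with OR instead.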

Author

Gavinic commented Jun 12, 2023

> I also notice that you're trying to decode whitespace characters. That could also be a potential cause of the issue with NAR decoding. Another thing you could try is increasing the number of refinement iterations + ANDing the cloze mask with a mask which excludes low-confidence characters.

Thank you very much. I will try these suggestions and report back.
