[ASR] Fix lexicon free decoder integration with wav2letter #5109

Merged
vineelpratap merged 1 commit into main from lexfree_fix2
May 19, 2023

@vineelpratap vineelpratap commented May 19, 2023

For the lexicon-free decoder to work, each symbol must have the same index in the token dictionary and in the word dictionary. This PR fixes a mismatch between the two.
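The index requirement above can be illustrated with a minimal, library-free sketch. It is not the code changed in this PR: `build_index` and `find_mismatches` are hypothetical helpers, and the four-symbol vocabulary is made up for illustration. The sketch only shows how a word dictionary built from a differently ordered symbol list ends up disagreeing with the token dictionary.

```python
from typing import Dict, List


def build_index(symbols: List[str]) -> Dict[str, int]:
    """Map each symbol to its position in the given list (hypothetical helper)."""
    return {sym: idx for idx, sym in enumerate(symbols)}


def find_mismatches(token_index: Dict[str, int], word_index: Dict[str, int]) -> List[str]:
    """Return the symbols whose index differs between the two dictionaries."""
    return sorted(
        sym for sym, idx in token_index.items() if word_index.get(sym) != idx
    )


# Toy vocabulary (an assumption, not the real token set): "|" is the word boundary.
tokens = ["|", "a", "b", "c"]
token_index = build_index(tokens)

# Misaligned case: the word dictionary is built from a re-ordered symbol list,
# so the same symbol maps to a different index in each dictionary.
bad_word_index = build_index(sorted(tokens))

# Aligned case: the word dictionary reuses the token order exactly.
good_word_index = build_index(tokens)

print("misaligned symbols:", find_mismatches(token_index, bad_word_index))
print("aligned symbols with mismatches:", find_mismatches(token_index, good_word_index))
```

A check like `find_mismatches` could be run once at decoder construction time to catch this class of bug early, assuming both dictionaries expose a symbol-to-index mapping.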

Tested on LibriSpeech using an n-gram character LM.

BEFORE (LM weight = 4):

 | t h e | o l d h e r s | w h o | h u d | m a r e d | m a n d | h l d | n o t n c e | n | n t h e | l t l | o l | o n | p n c e | m a r y | h l | h u n r o u n d | h e r | b r o t h e r s | n e c e | b u t | s e n | t h e v | e | m p e r e | h o w | t h e | p l s r | h e o w | h n d | t o | r e t u r n | h e | h o l | h m a e |

AFTER (LM weight = 4):

 | t h e | s o l d i e r s | w h o | h a d | c a r e d | f o r | a n d | h a d | n o t i c e d | a n d | t a k e n | t h e | t i t l e | g o l d | y o n | p r i n c e s | m a r y | h a d | h u n g | r o u n d | h e r | b r o t h e r ' s | n e c k | b u t | s e i n g | t h e | f a v o u r | t h e | e m p e r o r | s h o w e d | t h e | p r i s o n e r s | t h e y | n o w | h a s t e n e d | t o | r e t u r n | t h e | h o l y | i m a g e | |

@vineelpratap vineelpratap merged commit bfd9dc6 into main May 19, 2023
3 of 7 checks passed
@vineelpratap vineelpratap deleted the lexfree_fix2 branch May 19, 2023 01:29