This repository has been archived by the owner on Aug 3, 2021. It is now read-only.

INFO: Skipping trie generation, since no custom TF op based CTC decoder found. #542

Open
HunbeomBak opened this issue Jun 16, 2020 · 0 comments

HunbeomBak commented Jun 16, 2020

When building the language model, build_lm.py doesn't generate the lm.binary file.

To run a decoder on my dataset, the CTC decoder or greedy decoder needs lm.binary.

I use the Docker setup from https://nvidia.github.io/OpenSeq2Seq/html/installation.html (step 7).

I also ran ./scripts/install_kenlm.sh and scripts/install_decoders.sh.

However, build_lm.py doesn't generate lm.binary; it produces the output below.

kenlm/build/bin/lmplz --text /data/ASR_ATC/train_20200331.txt --arpa /data/ASR_ATC/train_20200331.arpa --o 5
=== 1/5 Counting and sorting n-grams ===
Reading /data/ASR_ATC/train_20200331.txt
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100


/data/ASR_ATC/OpenSeq2Seq/kenlm/util/scoped.cc:20 in void* util::{anonymous}::InspectAddr(void*, std::size_t, const char*) threw MallocException because `!addr && requested'.
Cannot allocate memory for 44258881520 bytes in malloc
Aborted (core dumped)
kenlm/build/bin/build_binary trie -q 8 -b 7 -a 256 /data/ASR_ATC/train_20200331.arpa /data/ASR_ATC/train_20200331-lm.binary
Reading /data/ASR_ATC/train_20200331.arpa
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
End of file Byte: 0
ERROR
INFO: Skipping trie generation, since no custom TF op based CTC decoder found.
INFO: Please use Baidu CTC decoder with this language model.
root@62ed5592ecb7:/data/ASR_ATC/OpenSeq2Seq#
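
Reading the log above, lmplz appears to abort on the ~44 GB malloc before writing any ARPA data, and build_binary then fails on the empty .arpa file ("End of file Byte: 0"), so no lm.binary is produced. A minimal sketch of how the two steps could be wrapped so the first failure stops the run is below. It is not the code in build_lm.py: the function names are hypothetical, the "4G" value is an arbitrary assumption, and passing lmplz's -S (sorting memory) option may or may not be enough to avoid the allocation failure on a given machine.

```python
import os
import subprocess


def lmplz_command(lmplz_bin, text_path, arpa_path, order=5, memory="4G"):
    """Build the lmplz argument list (hypothetical helper).

    -S caps the memory lmplz uses for sorting; "4G" is an assumed value,
    adjust it to what the container can actually allocate.
    """
    return [lmplz_bin, "--text", text_path, "--arpa", arpa_path,
            "-o", str(order), "-S", memory]


def arpa_is_usable(arpa_path):
    """Return True only if the ARPA file exists and is not empty."""
    return os.path.isfile(arpa_path) and os.path.getsize(arpa_path) > 0


def build_lm(lmplz_bin, build_binary_bin, text_path, arpa_path, binary_path,
             order=5, memory="4G"):
    """Run lmplz, then build_binary, stopping at the first failure."""
    # check=True raises CalledProcessError if lmplz exits abnormally.
    subprocess.run(
        lmplz_command(lmplz_bin, text_path, arpa_path, order, memory),
        check=True,
    )
    # Guard against an empty ARPA file before calling build_binary.
    if not arpa_is_usable(arpa_path):
        raise RuntimeError("lmplz produced no ARPA output: " + arpa_path)
    # Same build_binary arguments as in the log above.
    subprocess.run(
        [build_binary_bin, "trie", "-q", "8", "-b", "7", "-a", "256",
         arpa_path, binary_path],
        check=True,
    )
```

With the paths from the log, a call would look like build_lm("kenlm/build/bin/lmplz", "kenlm/build/bin/build_binary", "/data/ASR_ATC/train_20200331.txt", "/data/ASR_ATC/train_20200331.arpa", "/data/ASR_ATC/train_20200331-lm.binary"). If lmplz still aborts, the exception is raised at the lmplz step instead of surfacing later as a build_binary error.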
