How to stitch together several preexisting G.fst with KAG #61

Open
lormaechea opened this issue Aug 20, 2021 · 2 comments

Hi David,

First of all, I would like to express my appreciation for your excellent work! I am taking the liberty of contacting you because I would like to ask you a question about your project.

I am currently working on a speech-to-text translation system whose goal is to recognize spoken French and translate it into different languages. Among my tasks, I am in charge of creating and improving the ASR module. I have already implemented several prototypes based on the "regular" online Kaldi, and have used different language model configurations (ranging from context-specific to more generic interpolated models), but I would now like to adopt your framework so as to stitch my different LMs together and activate them dynamically at decode time.

On the basis of the code made available in your repository and the issues tab, I have been able to create and compile my own AGF custom models for French. I have tested them and they appear to perform well and be fully functional (I can provide more information on this if you are interested).

However, I am not completely sure whether it is possible to directly import other language models that have already been compiled with regular Kaldi. In "full_example.py" you define specific rules that are later compiled into FSTs, but I was wondering if it would be possible to integrate a previously compiled grammar (G.fst) into KAG. Could you give me some ideas/code snippets on this point?
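
Just to illustrate what I have in mind, something along these lines (purely hypothetical pseudo-code; as far as I know, neither of these calls actually exists in KaldiAG):

# Hypothetical: load a G.fst already compiled with regular Kaldi
# and register it as one more active grammar.
other_grammar = kaldi_active_grammar.load_precompiled_fst("G_other.fst")  # hypothetical helper
compiler.add_grammar(other_grammar)                                       # hypothetical method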

Thanks in advance for your time and attention.

Best regards,

Lucía

daanzu (Owner) commented Aug 29, 2021

@lormaechea Thanks! Your project sounds interesting, and I am happy to hear that it works well for French. Any instructions that you could write up would likely help others.

Yes, using multiple arbitrary G.fst files should be possible, although it is not something I directly designed for. It is easy to choose any single G.fst to be used as the single Dictation grammar, but I mostly designed for all of the other grammars to be built through the KaldiAG API. There are, however, built-in functions that should allow you to load any FST file directly to be used as a grammar, although I use them mostly for testing and debugging. You should find them in the FST module file in the kaldi-fork, and in the NativeWFST module file of KaldiAG. Let me know if you need more tips.
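
For example, for the Dictation case, a minimal sketch would look something like this (untested; the paths are placeholders, and Dictation.fst is assumed to have already been built from your own G.fst, e.g. with compile_agf_dictation_graph):

import kaldi_active_grammar

# Initialize the compiler against the model directory (placeholder paths).
compiler = kaldi_active_grammar.Compiler(model_dir="kaldi_model/", tmp_dir="kaldi_model.tmp/")

# Use the dictation graph built from your own G.fst as the single Dictation grammar.
decoder = compiler.init_decoder(dictation_fst_file="kaldi_model/Dictation.fst")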

lormaechea (Author) commented

Hi again @daanzu! Thanks for your response and for the help provided. I will explain my own procedure for creating custom models in #39.

I'm trying to add a second G.fst grammar as part of my KaldiAG custom model, but I wonder whether I'm doing the right thing or whether there are still more steps to go through.

I first exposed the NativeWFST class in the __init__.py file:

from .wfst import WFST, NativeWFST

I then set up my French custom model (which I had already compiled using compile_agf_dictation_graph):

compiler = kaldi_active_grammar.Compiler(model_dir=model_dir, tmp_dir=tmp_dir)
compiler.fst_cache.invalidate()
decoder = compiler.init_decoder(dictation_fst_file=model_dir+"Dictation.fst")

And I finally loaded my second grammar using:

test = kaldi_active_grammar.NativeWFST.load_file("grammar.fst")

Will this do the trick? When I check the tmp directory, there is just one FST file (I wonder whether there should be two, given the files loaded above).
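
For reference, here is the whole sequence put together as I currently have it (a rough sketch; the paths are placeholders):

import os
import kaldi_active_grammar

# French custom model, previously compiled with compile_agf_dictation_graph (placeholder paths).
model_dir = "french_model/"
tmp_dir = "french_model.tmp/"

compiler = kaldi_active_grammar.Compiler(model_dir=model_dir, tmp_dir=tmp_dir)
compiler.fst_cache.invalidate()  # start from a clean FST cache
decoder = compiler.init_decoder(dictation_fst_file=os.path.join(model_dir, "Dictation.fst"))

# Second, previously compiled grammar, loaded directly as a NativeWFST.
test = kaldi_active_grammar.NativeWFST.load_file("grammar.fst")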

Thanks again.
