What datasets was the model trained on? #38

sraghuram90 · 2020-10-15T07:37:17Z

What datasets was this Kaldi Active Grammar model trained on?
If you included public datasets, could you name them?
Is the pretrained model you mentioned the Zamia speech model?

JohnDoe02 · 2020-10-20T22:00:48Z

I was also curious about this. According to here (cf. stage 2), it should be: Librispeech, TEDLIUM, Mozilla's Common Voice, Tatoeba, and TensorFlow's speech_commands.

daanzu · 2020-11-01T06:20:08Z

Actually, daanzu_multi_en is a partial and unfinished training pipeline. I have ended up working with a heavily modified version of the Zamia pipeline. The datasets are:

Common Voice
Common Voice single word
Librispeech
LJ Speech
M-AILabs
Google Speech Commands
Tatoeba
TedLIUM3
Voxforge
A collection of TTS I generated

zhouyong64 · 2021-12-09T11:51:40Z

What about kaldi_model_daanzu_20211030-biglm? Was it also trained on these datasets?

daanzu · 2021-12-12T06:55:32Z

@zhouyong64

Yes, the new model is trained on the same datasets. The major change is that it now includes the models needed to run g2p_en for local pronunciation generation.
