Finetuning a Grad-TTS model on a small dataset? #21

godspirit00 · 2022-09-06T03:44:43Z

Hi! Thanks for the great work!
I have a small dataset of ~2 hours.
How do I finetune a Grad-TTS model on it?
Thanks!

Misterion777 · 2022-10-28T09:32:47Z

Hi, same question, is it possible?

li1jkdaw · 2023-05-14T15:26:32Z

Hi, @godspirit00, @Misterion777!
Yes, it is possible. You can either start from our checkpoint trained on LibriTTS and fine-tune it on your voice (see #8 for more details), or train Grad-TTS on a large multi-speaker dataset (see #23 for details) and then fine-tune it. If your dataset is in English, the first option is probably the easiest, and around 10 minutes of the target voice may be enough. If the language is different, the second option may be better: find a large open-source dataset in that language, train Grad-TTS on it, and then fine-tune it on your target voice.
In any case, for better sound quality, don't forget to either use the pre-trained universal HiFi-GAN vocoder or fine-tune it on your voice.
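
Before fine-tuning, it can help to check how much target-voice audio you actually have. Below is a minimal sketch, not part of the Grad-TTS repository, that sums the duration of the WAV files in a folder using only the Python standard library. The function names and the folder layout are hypothetical, the 10-minute default only mirrors the rough estimate above and is not a hard requirement, and the `wave` module reads uncompressed PCM WAV files only.

```python
import wave
from pathlib import Path


def total_wav_seconds(root):
    """Sum the duration, in seconds, of every .wav file under `root`."""
    total = 0.0
    for path in sorted(Path(root).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # duration = number of frames / frames per second
            total += wav.getnframes() / wav.getframerate()
    return total


def enough_for_finetuning(root, min_minutes=10.0):
    """Compare the dataset length with a rough target.

    The 10-minute default is an assumption taken from the estimate
    in this thread, not a limit enforced by Grad-TTS.
    """
    return total_wav_seconds(root) >= min_minutes * 60.0
```

For example, `total_wav_seconds("my_voice/wavs") / 60` would give the dataset length in minutes, where `my_voice/wavs` is a placeholder path.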
