Finetuning a Grad-TTS model on a small dataset? #21

Open
godspirit00 opened this issue Sep 6, 2022 · 2 comments

Comments

@godspirit00

Hi! Thanks for the great work!
I have a small dataset of ~2 hours.
How do I finetune a Grad-TTS model on it?
Thanks!

@Misterion777

Hi, same question, is it possible?

@li1jkdaw

li1jkdaw commented May 14, 2023

Hi, @godspirit00, @Misterion777!
Yes, it is possible. You can either start from our checkpoint trained on LibriTTS and fine-tune it on your voice (see #8 for details), or train Grad-TTS on a large multi-speaker dataset (see #23 for details) and then fine-tune it. If your dataset is in English, the first option is probably the easiest, and around 10 minutes of the target voice may be enough. If the language is different, the second option may be better: find a large open-source dataset in that language, train Grad-TTS on it, and then fine-tune it on your target voice.
In any case, remember to either use the pre-trained universal HiFi-GAN vocoder or fine-tune it on your voice for better sound quality.
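
The choice between the two routes can be summarised as a small helper. This is only a sketch of the advice above, not part of the Grad-TTS code: the names `FinetunePlan`, `choose_finetuning_plan` and `MIN_TARGET_MINUTES` are hypothetical, and treating "around 10 minutes" as a fixed threshold is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: "around 10 minutes" from the advice above, used as a rough threshold.
MIN_TARGET_MINUTES = 10.0


@dataclass
class FinetunePlan:
    steps: List[str]
    # True/False for the English route; None when no figure was given.
    enough_data: Optional[bool]


def choose_finetuning_plan(language: str, target_minutes: float) -> FinetunePlan:
    """Hypothetical helper mirroring the advice above; not a Grad-TTS API."""
    if language.strip().lower() in {"en", "english"}:
        steps = [
            "start from the checkpoint trained on LibriTTS (see #8)",
            "fine-tune it on the target voice",
        ]
        enough_data: Optional[bool] = target_minutes >= MIN_TARGET_MINUTES
    else:
        steps = [
            "find a large open-source dataset in the target language",
            "train Grad-TTS on it (see #23)",
            "fine-tune the result on the target voice",
        ]
        # No data estimate was given for this route, so none is guessed here.
        enough_data = None
    # The vocoder advice applies to both routes.
    steps.append(
        "use the pre-trained universal HiFi-GAN vocoder, "
        "or fine-tune it on the target voice"
    )
    return FinetunePlan(steps=steps, enough_data=enough_data)


# Example: an English dataset of about 2 hours, as in the original question.
plan = choose_finetuning_plan("English", target_minutes=120)
for number, step in enumerate(plan.steps, start=1):
    print(f"{number}. {step}")
```

For a non-English dataset the helper returns the longer route and leaves `enough_data` unset, since no data estimate was given for that case.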
