This repository has been archived by the owner on Nov 11, 2023. It is now read-only.

Tips on training #75

Open
carlosedubarreto opened this issue Mar 23, 2023 · 5 comments

Comments

@carlosedubarreto

I have been testing this code, and in my experience training for 300 epochs gives terrible results: unintelligible sounds.

Is it possible that the audio I'm using as input is the problem? Or is it normal to get only noise from inference at 300 epochs?

And what is the minimum amount of training needed to get something intelligible from inference?

Thanks a lot for the help

@carlosedubarreto
Author

Some feedback on my experiments: at epoch 2000 I start to get output that is a little understandable. It sounds like the speaker is talking from very far away.

I also noticed that other models train for about 190k epochs, so I feel that making it better is just a matter of letting it train.

@leng-yue
Contributor

It depends on your dataset size and whether you use the pre-trained model. Generally, training HiFiGAN (vocoder) from scratch will take about 300k to 1 million steps to get a good result.

@carlosedubarreto
Author

Would the pre-trained model be the files G_0.pth and D_0.pth?
You also raised another question: the size of the dataset.
Do you know what would be a good dataset size?

For testing I'm using 5 clips of 10 seconds each, the same ones I used with Tortoise TTS, where they gave a very good result.

Thanks a lot for the clarifications.

@leng-yue
Contributor

Yes, those pths are the pre-trained models. 50s data is not useful for training from scratch.
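
To relate the epoch counts reported earlier in this thread to the step counts quoted for HiFiGAN, a rough conversion can help. The sketch below assumes one optimizer step per batch, that the last partial batch is kept, and a hypothetical batch size of 4. The function names and the batch size are illustrative and do not come from this repository, and a real pipeline may slice each clip into more than one training sample.

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Optimizer steps in one pass over the data (last partial batch kept)."""
    return math.ceil(num_samples / batch_size)


def epochs_to_steps(epochs: int, num_samples: int, batch_size: int) -> int:
    """Total steps performed after training for the given number of epochs."""
    return epochs * steps_per_epoch(num_samples, batch_size)


def steps_to_epochs(steps: int, num_samples: int, batch_size: int) -> int:
    """Epochs needed to reach at least the given number of steps."""
    return math.ceil(steps / steps_per_epoch(num_samples, batch_size))


NUM_CLIPS = 5    # the 5 ten-second clips mentioned in the thread
BATCH_SIZE = 4   # hypothetical value, chosen only for illustration

print(epochs_to_steps(300, NUM_CLIPS, BATCH_SIZE))       # steps after 300 epochs
print(epochs_to_steps(2000, NUM_CLIPS, BATCH_SIZE))      # steps after 2000 epochs
print(steps_to_epochs(300_000, NUM_CLIPS, BATCH_SIZE))   # epochs to reach 300k steps
```

Under these assumptions, a few hundred or a few thousand epochs on five clips amounts to only a few thousand steps, far short of the 300k to 1 million steps mentioned for training from scratch.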

@ne0escape

> Yes, those pths are the pre-trained models. 50s data is not useful for training from scratch.

How much data do you think is useful?
