NeMo and W&B workshop

Here is the code for the workshop "Training and Tuning a Text-to-Speech Model with NVIDIA NeMo and Weights and Biases".

Installation

If you are running in SageMaker, you will have to choose an image with PyTorch and GPU support.

Choose a PyTorch image from the Environment drop-down menu.

Choose a GPU-equipped machine.

Open a terminal on the machine and clone this repo using git clone.

Run the sm_setup.sh script inside the machine image terminal.

> bash sm_setup.sh

Running the code

Follow the notebooks in order

01_log_datasets.ipynb will download the data and create a train/valid split. You will learn how to store your data on Weights and Biases.
02_FastPitch_finetune.ipynb: you will fine-tune a FastPitch model on a particular speaker. We will then analyse the results using wandb.Tables.
03_HiFIGAN_finetune.ipynb: improve the previous results by fine-tuning the HiFi-GAN model with the data from the new speaker!
04_final_validation.ipynb: analyse the final results after both fine-tuning runs.
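The train/valid split mentioned for the first notebook can be sketched as follows. This is a minimal illustration, not the notebook's actual code: it assumes the data is described by a JSON-lines manifest (one JSON record per line), and the function names, the 10% validation fraction, and the seed are all hypothetical choices.

```python
import json
import random
from pathlib import Path


def split_manifest(manifest_path, valid_fraction=0.1, seed=42):
    """Split a JSON-lines manifest into (train, valid) lists of records.

    Assumption: each non-empty line is one JSON object describing a sample.
    """
    lines = Path(manifest_path).read_text(encoding="utf-8").splitlines()
    records = [json.loads(line) for line in lines if line.strip()]

    # Shuffle with a fixed seed so the split is reproducible across runs.
    rng = random.Random(seed)
    rng.shuffle(records)

    # Keep at least one sample for validation, even on tiny datasets.
    n_valid = max(1, int(len(records) * valid_fraction))
    return records[n_valid:], records[:n_valid]


def write_manifest(records, out_path):
    """Write records back out, one JSON object per line."""
    with open(out_path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
```

Calling `split_manifest("manifest.json")` (a placeholder path) returns two lists that can be written out with `write_manifest` before being stored on Weights and Biases.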

Notes

You will need a machine with at least 16 GB of VRAM. You can try decreasing the batch size on smaller machines, but you will need to adjust the number of training steps accordingly.
The checkpoints in exp_dir use a lot of space, and ~/.cache also gets heavy quickly. You can safely delete the cache, as its contents will be re-downloaded if necessary.
The data for speaker 9017 is really small, so the model performance is not very good.
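One way to read "adjust the steps accordingly" is to keep the total number of training samples seen roughly constant when the batch size changes. The sketch below makes that assumption explicit; the function name and the example numbers are hypothetical and are not taken from the notebooks.

```python
import math


def scale_steps(reference_steps, reference_batch_size, new_batch_size):
    """Return the step count that sees about as many samples as the reference run.

    Assumption: steps scale inversely with batch size, so that
    steps * batch_size stays (approximately) constant.
    """
    if min(reference_steps, reference_batch_size, new_batch_size) <= 0:
        raise ValueError("steps and batch sizes must be positive")
    samples_seen = reference_steps * reference_batch_size
    # Round up so the smaller-batch run never sees fewer samples.
    return math.ceil(samples_seen / new_batch_size)


# Hypothetical example: halving the batch size doubles the steps.
print(scale_steps(1000, 32, 16))  # 2000
```

Other hyperparameters, such as the learning rate, may also need retuning after a batch size change; this sketch only covers the step count.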

Bring your own dataset

You can also create a dataset of your own, using NeMo ASR features.

Pitch metrics need to be computed from your data: a script can help you out 😎
You will need to normalize your text inputs; check this notebook.
The audio files are expected to have a sample rate of 22050 Hz. You can export your data with from scipy.io.wavfile import write.
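Before training on your own data, it may help to check that every file really has the expected sample rate. The following is a minimal sketch using only the standard-library wave module; the function name and the directory layout (a folder of .wav files) are hypothetical, and the wave module only reads uncompressed PCM WAV files.

```python
import wave
from pathlib import Path

EXPECTED_SAMPLE_RATE = 22050  # Hz, as stated in this README


def find_wrong_sample_rate(audio_dir, expected=EXPECTED_SAMPLE_RATE):
    """Return {path: sample_rate} for every .wav file that is not at `expected` Hz.

    Note: wave.open raises wave.Error on non-PCM files (e.g. float WAVs).
    """
    mismatches = {}
    for wav_path in sorted(Path(audio_dir).rglob("*.wav")):
        with wave.open(str(wav_path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected:
            mismatches[str(wav_path)] = rate
    return mismatches
```

An empty result means all files found under the directory already match; any listed file would need to be resampled and re-exported before use.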


Copyrights are reserved by Thomas Capelle, ML Engineer @ Weights & Biases


aaaastark/NeMo-WeightsBiases-TTS
