A Text-to-Speech Pipeline, Evaluation Methodology and Initial Fine-Tuning Results for Child Speech Synthesis

All research-related material will be made available in this repository.

Training from scratch

Our codebase is a modified version of https://github.com/CorentinJ/Real-Time-Voice-Cloning, adapted to work with the MyST dataset as described in our paper. We advise following that repository: it provides detailed instructions for setting up the environment and training (https://github.com/CorentinJ/Real-Time-Voice-Cloning/wiki/Training), and it has a large community, so any issue you encounter may already be resolved there.

Using pretrained checkpoints

To help users reproduce our results, we provide pretrained checkpoints for our models trained on children’s speech. These checkpoints can be used directly with https://github.com/CorentinJ/Real-Time-Voice-Cloning/ to generate synthetic TTS voices.
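As a rough sketch of how the checkpoints might be plugged into the upstream repository, the helper below assembles a command line for the upstream `demo_cli.py` script. The flags `--enc_model_fpath`, `--syn_model_fpath` and `--voc_model_fpath` are assumed from the upstream repository at the time of writing and may differ between versions. The directory `Real-Time-Voice-Cloning` and the checkpoint file names are hypothetical placeholders, not the names of the files distributed here.

```python
import sys
from pathlib import Path


def build_demo_command(repo_dir, encoder_ckpt, synthesizer_ckpt, vocoder_ckpt):
    """Build the argv list for the upstream demo_cli.py with custom checkpoints.

    The three flag names are assumptions based on the upstream repository;
    check `python demo_cli.py --help` in your checkout before relying on them.
    """
    script = Path(repo_dir) / "demo_cli.py"
    return [
        sys.executable,
        str(script),
        "--enc_model_fpath", str(encoder_ckpt),
        "--syn_model_fpath", str(synthesizer_ckpt),
        "--voc_model_fpath", str(vocoder_ckpt),
    ]


def missing_checkpoints(*paths):
    """Return the checkpoint paths that do not exist on disk."""
    return [str(p) for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    # Hypothetical locations; replace with wherever you saved the checkpoints.
    checkpoints = ("ckpt/encoder.pt", "ckpt/synthesizer.pt", "ckpt/vocoder.pt")
    missing = missing_checkpoints(*checkpoints)
    if missing:
        print("Missing checkpoint files:", ", ".join(missing))
    else:
        print(" ".join(build_demo_command("Real-Time-Voice-Cloning", *checkpoints)))
```

The script only prints the command rather than running it, so the paths can be checked before launching synthesis.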

Note:

Helpful and essential scripts for cleaning the MyST dataset and fine-tuning on it will be made available. The project is still under development, so the full codebase will be released over time.

Users can obtain their own MyST license from https://catalog.ldc.upenn.edu/LDC2021S05.