A Text-to-Speech Pipeline, Evaluation Methodology and Initial Fine-Tuning Results for Child Speech Synthesis

Any research-related material will be made available in this repository.

Training from scratch

Our codebase is a modified version of https://github.com/CorentinJ/Real-Time-Voice-Cloning, adapted to work with the MyST dataset as described in our paper. We therefore advise following that repository's detailed training instructions (https://github.com/CorentinJ/Real-Time-Voice-Cloning/wiki/Training) to set up the environment. The repository also has a large community, so any issue you encounter may already be resolved there.

Using pretrained checkpoints

To help users reproduce our results, we provide pretrained checkpoints for our models trained on children’s speech. The checkpoints can be used directly with https://github.com/CorentinJ/Real-Time-Voice-Cloning/ to generate synthetic TTS voices.
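
As a minimal sketch of how the checkpoints might be passed to the upstream demo script, the helper below checks that the files exist and assembles a command line. It assumes three checkpoints (encoder, synthesizer, vocoder); the file names `encoder.pt`, `synthesizer.pt` and `vocoder.pt` are hypothetical placeholders, and the flag names are assumed to match `demo_cli.py` in the upstream repository, so confirm them with `python demo_cli.py --help` before relying on them.

```python
from pathlib import Path

# Hypothetical file names: replace them with the names of the released checkpoints.
# Flag names are an assumption about upstream demo_cli.py; verify with --help.
CHECKPOINTS = {
    "--enc_model_fpath": "encoder.pt",
    "--syn_model_fpath": "synthesizer.pt",
    "--voc_model_fpath": "vocoder.pt",
}


def build_demo_command(checkpoint_dir):
    """Return the argument list for demo_cli.py, or raise if a checkpoint is missing."""
    root = Path(checkpoint_dir)
    command = ["python", "demo_cli.py"]
    for flag, name in CHECKPOINTS.items():
        path = root / name
        if not path.is_file():
            raise FileNotFoundError(f"missing checkpoint: {path}")
        command += [flag, str(path)]
    return command
```

The returned list can be handed to `subprocess.run(command, cwd=...)`, with `cwd` pointing at a local clone of Real-Time-Voice-Cloning.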

Note:

Some helpful and essential scripts for cleaning the MyST dataset and fine-tuning on it will be made available. The project is still under development, so the full codebase will be released over time.

Users can obtain their own MyST license from https://catalog.ldc.upenn.edu/LDC2021S05.
