phineas-pta/speech-synthesis-ngngngan

speech-synthesis NgNgNgan

python script to download & process data to train a speech-synthesis model of Vietnamese M.C. Nguyễn Ngọc Ngạn

download and process audio to train a neural network that imitates the voice of Mr. Ngạn

for copyright reasons this repo contains only code, no data; anyone interested can follow the instructions below to run the code, download the audio, and train a model themselves

RVC checkpoints: https://huggingface.co/doof-ferb/rvc-ngngngan

Matcha-TTS checkpoints: https://huggingface.co/doof-ferb/matcha_ngngngan

Demo: Matcha-TTS 🤗 https://huggingface.co/spaces/doof-ferb/MatchaTTS_ngngngan

requirements

need NVIDIA GPU

install ffmpeg

git clone this repo

prepare a fresh python env (venv or conda)
pip install torch torchaudio --find-links=https://download.pytorch.org/whl/torch_stable.html
optional: pip install jupyter-lab tensorboard for visualization
e.g. tensorboard --logdir <path to folder containing events.out.tfevents.*>, then open localhost:6006 in a browser

or run pip install -r requirements.txt directly, though it may not be up to date
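
Before starting the workflow, it can help to confirm that the prerequisites above are in place. The following is a minimal sketch, not part of this repo: the function name check_environment is hypothetical, and it only checks that ffmpeg is on the PATH, that torch and torchaudio are importable, and that PyTorch can see a CUDA device.

```python
import importlib.util
import shutil


def check_environment():
    """Report whether the prerequisites listed above appear to be available.

    Hypothetical helper for illustration; it is not shipped with this repo.
    """
    status = {
        # ffmpeg must be discoverable on the PATH
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # find_spec returns None when a top-level package is not installed
        "torch": importlib.util.find_spec("torch") is not None,
        "torchaudio": importlib.util.find_spec("torchaudio") is not None,
        "cuda": False,
    }
    if status["torch"]:
        try:
            import torch

            # True only when PyTorch can see a usable NVIDIA GPU
            status["cuda"] = bool(torch.cuda.is_available())
        except (ImportError, OSError):
            # A broken install counts as "no CUDA" rather than crashing
            status["cuda"] = False
    return status


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

If any entry is reported as MISSING, revisit the corresponding requirement before running the notebooks.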

workflow

Part 1: prepare data for RVC

Part 2: e.g. of RVC training + inference

Part 3: prepare data for text-to-speech

Part 4.1: e.g. VITS 2 training (abandoned because training took too long)

Part 4.2: e.g. Matcha-TTS training

miscellaneous

tell git to ignore local changes to these tracked files:

git update-index --skip-worktree data/vits2_ngngngan_nosdp.json
git update-index --skip-worktree tensorboard/export_tensorboard_RVC.py
git update-index --skip-worktree tensorboard/export_tensorboard_MatchaTTS.py
