CODEJIN/AutoVC

AutoVC

Requirements

  • torch >= 1.5.0

  • tensorboardX >= 2.0

  • librosa >= 0.7.2

  • matplotlib >= 3.1.3

  • Optional for losses flow

    • tensorboard >= 2.2.2

Used dataset

  • The currently uploaded code is compatible with the following datasets.
  • An O mark in the Used column indicates the dataset was actually used for the uploaded result.

Used  Dataset     Dataset address
O     VCTK        https://datashare.is.ed.ac.uk/handle/10283/2651
O     LibriTTS    https://openslr.org/60/
X     CMU Arctic  http://www.festvox.org/cmu_arctic/index.html
X     VoxCeleb1   http://www.robots.ox.ac.uk/~vgg/data/voxceleb/
X     VoxCeleb2   http://www.robots.ox.ac.uk/~vgg/data/voxceleb/

Hyper parameters

Before proceeding, please set the pattern, inference, and checkpoint paths in 'Hyper_Parameter.yaml' according to your environment.

  • Sound

    • Setting basic sound parameters.
  • Content_Encoder

    • Setting the parameters of the content encoder.
  • Style_Encoder

    • Setting the parameters of the style encoder.
  • Decoder

    • Setting the parameters of the decoder.
  • Postnet

    • Setting the parameters of the convolutional postnet.
  • WaveNet

    • Setting the parameters of the vocoder.
    • This implementation uses a pre-trained Parallel WaveGAN model.
    • If the checkpoint path is null, the model does not export wav files.
    • If the checkpoint path is not null, all parameters must match the pre-trained Parallel WaveGAN model.
  • Train

    • Setting the parameters of training.
    • When the number of speakers in your training dataset is small, I recommend increasing Train_Pattern/Accumulated_Dataset_Epoch.
  • Inference_Path

    • Setting the inference path.
  • Checkpoint_Path

    • Setting the checkpoint path.
  • Log_Path

    • Setting the tensorboard log path.
  • Device

    • Setting which GPU device is used in a multi-GPU environment.
    • To use the CPU only, set '-1'.
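
The layout below is a minimal sketch of how these sections might look in 'Hyper_Parameter.yaml'. The section names follow the bullets above, but the individual keys and values are illustrative assumptions, not the repository's exact schema; check the bundled file for the authoritative names.

```yaml
# Hypothetical sketch of Hyper_Parameter.yaml.
# Section names follow the README; the keys below are assumptions.
Sound:
    Sample_Rate: 16000          # assumed basic sound parameter
    Mel_Dim: 80                 # assumed mel-spectrogram dimension
WaveNet:
    Checkpoint_Path: null       # null: no wav export; otherwise all parameters
                                # must match the pre-trained Parallel WaveGAN model
Train:
    Train_Pattern:
        Accumulated_Dataset_Epoch: 1    # increase when the speaker count is small
Inference_Path: ./results       # assumed example path
Checkpoint_Path: ./checkpoints  # assumed example path
Log_Path: ./logs                # assumed example path
Device: '0'                     # GPU index, or '-1' for CPU only
```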

Generate pattern

Command

python Pattern_Generate.py [parameters]

Parameters

At least one dataset must be used.

  • -vctk
    • Set the path of VCTK. VCTK's patterns are generated.
  • -vc1
    • Set the path of VoxCeleb1. VoxCeleb1's patterns are generated.
  • -vc2
    • Set the path of VoxCeleb2. VoxCeleb2's patterns are generated.
  • -libri
    • Set the path of LibriTTS. LibriTTS's patterns are generated.
  • -cmua
    • Set the path of CMU Arctic. CMU Arctic's patterns are generated.
  • -vc1t
    • Set the path of VoxCeleb1 testset. VoxCeleb1's patterns are generated for an evaluation.
  • -mw
    • The number of threads used to create the patterns.
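
The flag handling above can be sketched with argparse. The flag names come from the list, but the parsing details (dest names, defaults, the `-mw` default of 4) are assumptions about Pattern_Generate.py, not its actual code.

```python
import argparse

def build_parser():
    # Flags mirror the README list; help strings and defaults are assumptions.
    parser = argparse.ArgumentParser(description='Generate training patterns')
    parser.add_argument('-vctk', help='Path of VCTK')
    parser.add_argument('-vc1', help='Path of VoxCeleb1')
    parser.add_argument('-vc2', help='Path of VoxCeleb2')
    parser.add_argument('-libri', help='Path of LibriTTS')
    parser.add_argument('-cmua', help='Path of CMU Arctic')
    parser.add_argument('-vc1t', help='Path of VoxCeleb1 testset (for evaluation)')
    parser.add_argument('-mw', type=int, default=4, help='Number of worker threads')
    return parser

# Example invocation: python Pattern_Generate.py -vctk ./VCTK-Corpus -mw 8
args = build_parser().parse_args(['-vctk', './VCTK-Corpus', '-mw', '8'])

# At least one dataset path must be given.
assert any([args.vctk, args.vc1, args.vc2, args.libri, args.cmua, args.vc1t])
print(args.vctk, args.mw)  # → ./VCTK-Corpus 8
```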

Run

Command

python Train.py -s <int>
  • -s <int>
    • The resume step parameter.
    • Default is 0.
    • When this parameter is 0, the model tries to find the latest checkpoint in the checkpoint path.
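
One plausible way the latest-checkpoint lookup could work is sketched below. The filename pattern 'S_<step>.pt' is an assumption for illustration, not the repository's actual naming convention.

```python
import os
import re
import tempfile

def find_latest_checkpoint_step(checkpoint_dir):
    # Assumed filename pattern 'S_<step>.pt'; returns the highest step found,
    # or 0 when the directory contains no matching checkpoints.
    steps = []
    for name in os.listdir(checkpoint_dir):
        match = re.fullmatch(r'S_(\d+)\.pt', name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps, default=0)

# Usage: emulate `python Train.py -s 0` resuming from the newest checkpoint.
with tempfile.TemporaryDirectory() as tmp:
    for step in (10000, 50000, 25000):
        open(os.path.join(tmp, f'S_{step}.pt'), 'w').close()
    resume_step = find_latest_checkpoint_step(tmp)
    print(resume_step)  # → 50000
```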

Result

  • Current training....

Trained checkpoint

  • Current training....
