GitHub - himajin2045/voice-conversion: Voice Conversion pipeline consisting of GE2E speaker encoder, AutoVC conversion model and MelGAN vocoder.

Voice Conversion

This is a collection of speech models: a speaker encoder model, a voice conversion model and a vocoder model, which together form a complete Voice Conversion pipeline.

The speaker encoder model is an implementation that uses the GE2E loss, and the code is from https://github.com/CorentinJ/Real-Time-Voice-Cloning

The voice conversion model is AutoVC and the code is from https://github.com/auspicious3000/autovc

The vocoder model is MelGAN and the code is from https://github.com/descriptinc/melgan-neurips
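As a rough illustration of how the three models fit together at conversion time, here is a minimal sketch. It is not the code in this repo: `convert_voice` and its three callable parameters are hypothetical stand-ins for the trained models, and it assumes an AutoVC-style conversion step that takes the source mel features plus source and target speaker embeddings.

```python
from typing import Callable, List, Sequence

Mel = Sequence[Sequence[float]]   # frames x mel bins
Embedding = Sequence[float]       # fixed-size speaker vector


def convert_voice(
    source_mel: Mel,
    target_mel: Mel,
    speaker_encoder: Callable[[Mel], Embedding],
    conversion_model: Callable[[Mel, Embedding, Embedding], Mel],
    vocoder: Callable[[Mel], List[float]],
) -> List[float]:
    """Chain the three models (all passed in as hypothetical callables)."""
    # 1. Speaker encoder: one embedding per speaker, computed from mel features.
    source_emb = speaker_encoder(source_mel)
    target_emb = speaker_encoder(target_mel)
    # 2. Conversion model: keep the source content, swap in the target speaker.
    converted_mel = conversion_model(source_mel, source_emb, target_emb)
    # 3. Vocoder: turn the converted mel features back into a waveform.
    return vocoder(converted_mel)
```

The conversion step needs the speaker encoder's output, and the final audio needs the vocoder, which is why inference depends on all three trained models.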

You must make appropriate changes (e.g. change the dataset paths, or the model parameters in hparams.py or the train_xxx.py files) in order to run the code. I'm not going to explain the code, since it's been almost a year since I last ran it and I don't remember the details :)

The workflow is quite simple though:

Collect some speech datasets (see dataset.py)
Run the preprocess.py script to convert the raw audio to mel features, so the conversion doesn't have to be done on the fly while training
Train the speaker encoder model
Train the vocoder model
Train the voice conversion model; this depends on a trained speaker encoder model
Run the inference function in the train_vc.py script to do the conversion; this depends on all 3 models
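The idea behind the preprocessing step (compute features once, then read them from disk during training) can be sketched roughly like this. This is not the actual preprocess.py: `preprocess_to_cache`, `load_cached`, the `.pkl` layout and the `extract_mel` callable are all assumptions made for illustration, and the real mel extraction is left to whatever function you pass in.

```python
import pickle
from pathlib import Path
from typing import Any, Callable, Iterable, List


def preprocess_to_cache(
    audio_paths: Iterable[str],
    out_dir: str,
    extract_mel: Callable[[Path], Any],
) -> List[Path]:
    """Run `extract_mel` once per audio file and store the result on disk."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    cached = []
    for audio_path in map(Path, audio_paths):
        # Assumes file stems are unique; add the speaker ID to the name if not.
        target = out / (audio_path.stem + ".pkl")
        if not target.exists():  # skip files that were already processed
            with open(target, "wb") as f:
                pickle.dump(extract_mel(audio_path), f)
        cached.append(target)
    return cached


def load_cached(path: Path) -> Any:
    """Read precomputed features back; this is all the training loop has to do."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Running it a second time over the same files does no feature extraction at all, so the cost is paid once rather than on every epoch.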

The speaker encoder model and the vocoder model should be trained on a large dataset combined from many corpora (see dataset.py for all the corpora I used).

The voice conversion model can be trained on a small number of speakers from one of the corpora; 120 speakers with 120 utterances per speaker is enough to get solid performance.
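Picking such a subset could look something like the sketch below. The function `select_subset` and the speaker-to-utterances dictionary are hypothetical and not taken from dataset.py; the defaults of 120 and 120 simply mirror the numbers mentioned above.

```python
import random
from typing import Dict, List


def select_subset(
    utterances_by_speaker: Dict[str, List[str]],
    n_speakers: int = 120,
    n_utterances: int = 120,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Pick `n_speakers` speakers and `n_utterances` utterances for each."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    # Only speakers with enough recordings can contribute a full set.
    eligible = sorted(
        speaker
        for speaker, utts in utterances_by_speaker.items()
        if len(utts) >= n_utterances
    )
    if len(eligible) < n_speakers:
        raise ValueError(
            f"only {len(eligible)} speakers have >= {n_utterances} utterances"
        )
    chosen = rng.sample(eligible, n_speakers)
    return {s: rng.sample(utterances_by_speaker[s], n_utterances) for s in chosen}
```

Speakers with too few utterances are dropped rather than padded, so every selected speaker contributes the same amount of data.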

Folders and files (history: 2 commits)

.gitignore
README.md
dataset.py
dsp.py
hparams.py
model_vc.py
model_vocoder.py
preprocess.py
speaker_encoder.py
train_speaker_encoder.py
train_vc.py
train_vocoder.py
