Multi-band MelGAN and Full band MelGAN

Unofficial PyTorch implementation of the Multi-Band MelGAN paper. It uses Seungwon Park's MelGAN repo as a base and the PQMF filter implementation from this repo.
MelGAN :
Multi-band MelGAN:

Prerequisites

Tested on Python 3.6

pip install -r requirements.txt

Prepare Dataset

Download a dataset for training. This can be any set of wav files with a sample rate of 22050 Hz (e.g. LJSpeech was used in the paper).
Preprocess: python preprocess.py -c config/default.yaml -d [data's root path]
Edit the configuration yaml file.
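
Before preprocessing, it can help to confirm that every wav file really is sampled at 22050 Hz. The following is a minimal sketch, not part of this repo: the function name find_wrong_sample_rate is hypothetical, and it relies on Python's standard wave module, which only reads uncompressed PCM wav files.

```python
import wave
from pathlib import Path

EXPECTED_SAMPLE_RATE = 22050  # sample rate stated in this README


def find_wrong_sample_rate(root, expected=EXPECTED_SAMPLE_RATE):
    """Return (path, rate) pairs for wav files under root whose rate differs.

    Hypothetical helper for illustration; it is not part of this repo.
    """
    mismatches = []
    # Walk the dataset root recursively, in a stable order.
    for path in sorted(Path(root).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected:
            mismatches.append((path, rate))
    return mismatches
```

An empty list means every file found matches the expected rate; any file that is reported would need resampling before running preprocess.py.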

Train & Tensorboard

python trainer.py -c [config yaml file] -n [name of the run]
- cp config/default.yaml config/config.yaml and then edit config.yaml
- Set the root paths of the train and validation files on the 2nd and 3rd lines.
- Each path should contain pairs of *.wav files with their corresponding (preprocessed) *.mel files.
- The data loader parses the list of files within each path recursively.
- For multi-band training, pass the config/mb_melgan config file to -c.
tensorboard --logdir logs/
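
Since each data path must contain wav files paired with preprocessed mel files, a quick check before training can catch files that preprocessing missed. This is a minimal sketch under one assumption: each mel file sits next to its wav file and has the same name with a .mel suffix. The function name find_unpaired_wavs is hypothetical and not part of this repo.

```python
from pathlib import Path


def find_unpaired_wavs(root):
    """Return wav files under root that have no sibling .mel file.

    Assumption (not confirmed by this README): foo.wav pairs with foo.mel
    in the same directory.
    """
    return [
        wav_path
        for wav_path in sorted(Path(root).rglob("*.wav"))  # recursive, like the data loader
        if not wav_path.with_suffix(".mel").exists()
    ]
```

If the returned list is non-empty, rerunning preprocess.py on the same root path should produce the missing mel files.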

Pretrained model

Check out here.

Inference

python inference.py -p [checkpoint path] -i [input mel path]

Results

References

License

BSD 3-Clause License.

utils/stft.py by Prem Seetharaman (BSD 3-Clause License)
datasets/mel2samp.py from https://github.com/NVIDIA/waveglow (BSD 3-Clause License)
utils/hparams.py from https://github.com/HarryVolek/PyTorch_Speaker_Verification (No License specified)

Useful resources

How to Train a GAN? Tips and tricks to make GANs work by Soumith Chintala
Official MelGAN implementation by original authors
Reproduction of MelGAN - NeurIPS 2019 Reproducibility Challenge (Ablation Track) by Yifei Zhao, Yichao Yang, and Yang Gao
- "replacing the average pooling layer with max pooling layer and replacing reflection padding with replication padding improves the performance significantly, while combining them produces worse results"

rishikksh20/melgan