GitHub - L0SG/WaveFlow: A PyTorch implementation of "WaveFlow: A Compact Flow-based Model for Raw Audio" (ICML 2020)

WaveFlow: A Compact Flow-based Model for Raw Audio

Update: Pretrained weights are now available. See links below.

This is an unofficial PyTorch implementation of the WaveFlow model (Ping et al., ICML 2020).

The aim of this repo is to provide an easy-to-use PyTorch version of WaveFlow as a drop-in alternative to the various neural vocoder models used with NVIDIA's Tacotron2 audio processing backend.

Please refer to the official implementation written in PaddlePaddle for the official results.

Setup

Clone this repo and install requirements

```
git clone https://github.com/L0SG/WaveFlow.git
cd WaveFlow
pip install -r requirements.txt
```

Install Apex for mixed-precision training

Train your model

Download LJ Speech Data. In this example it's in data/
Make a list of the file names to use for training/testing.
```
ls data/*.wav | tail -n+11 > train_files.txt
ls data/*.wav | head -n10 > test_files.txt
```
-n+11 and -n10 indicate that this example reserves the first 10 audio clips for model testing. tail -n+11 starts at the 11th file, so the training and test lists do not overlap.
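The same split can be produced from Python, which may be convenient on systems without head and tail. This is a minimal sketch rather than part of the repo: the function names are hypothetical, and Python's sorted() orders by code point, which can differ from the locale-dependent ordering of ls.

```python
from pathlib import Path


def split_file_list(wav_paths, num_test=10):
    """Return (train, test): the first `num_test` sorted paths go to test."""
    ordered = sorted(str(p) for p in wav_paths)
    return ordered[num_test:], ordered[:num_test]


def write_file_lists(data_dir="data", num_test=10):
    """Write train_files.txt and test_files.txt for the .wav files in data_dir."""
    train, test = split_file_list(Path(data_dir).glob("*.wav"), num_test)
    Path("train_files.txt").write_text("\n".join(train) + "\n")
    Path("test_files.txt").write_text("\n".join(test) + "\n")
    return train, test
```

Because both lists are slices of one sorted list, no file can end up in both the training and the test set.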
Edit the configuration file and train the model.

Below are the example commands using waveflow-h16-r64-bipartize.json
```
nano configs/waveflow-h16-r64-bipartize.json
python train.py -c configs/waveflow-h16-r64-bipartize.json
```
Single-node multi-GPU training is automatically enabled with DataParallel (instead of DistributedDataParallel for simplicity).

For mixed-precision training, set "fp16_run": true in the configuration file.

You can load trained weights from a saved checkpoint by setting the checkpoint_path variable in the config file to its path.

checkpoint_path accepts either an explicit checkpoint path, or the parent directory when resuming from weights averaged over multiple checkpoints.
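If you prefer to script these config changes instead of editing the JSON by hand, something like the following could work. It is a sketch under assumptions: the helper names are hypothetical, and because the nesting of keys inside the config file is not described here, the helper searches the whole JSON tree for keys that already exist instead of assuming where they live.

```python
import json


def set_config_key(node, key, value):
    """Set `key` to `value` wherever it already appears in nested JSON data.

    Returns the number of occurrences that were updated.
    """
    updated = 0
    if isinstance(node, dict):
        for name in node:
            if name == key:
                node[name] = value
                updated += 1
            else:
                updated += set_config_key(node[name], key, value)
    elif isinstance(node, list):
        for item in node:
            updated += set_config_key(item, key, value)
    return updated


def update_config_file(path, **changes):
    """Load a JSON config, overwrite the given existing keys, and save it."""
    with open(path) as f:
        config = json.load(f)
    for key, value in changes.items():
        if set_config_key(config, key, value) == 0:
            raise KeyError(f"{key!r} not found in {path}")
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
    return config


# Example call (assumes both keys already exist somewhere in the file):
# update_config_file(
#     "configs/waveflow-h16-r64-bipartize.json",
#     fp16_run=True,
#     checkpoint_path="experiments/waveflow-h16-r64-bipartize/waveflow_5000",
# )
```

Raising KeyError for a missing key avoids silently adding a setting at a level of the file where train.py would never read it.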

Examples

To resume from a single checkpoint, insert checkpoint_path: "experiments/waveflow-h16-r64-bipartize/waveflow_5000" in the config file, then run
```
python train.py -c configs/waveflow-h16-r64-bipartize.json
```
To load weights averaged over the 10 most recent checkpoints, insert checkpoint_path: "experiments/waveflow-h16-r64-bipartize" in the config file, then run
```
python train.py -a 10 -c configs/waveflow-h16-r64-bipartize.json
```
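For intuition, checkpoint averaging amounts to an element-wise mean of the parameters over the most recent checkpoints. The sketch below illustrates that general technique; it is not the code train.py uses. The function names are hypothetical, and the only detail taken from this README is the waveflow_<step> naming seen in the example path above.

```python
def latest_checkpoints(names, n):
    """Pick the n names with the largest trailing step number (waveflow_5000)."""
    def step(name):
        return int(name.rsplit("_", 1)[1])
    return sorted(names, key=step)[-n:]


def average_state_dicts(state_dicts):
    """Element-wise mean of parameter dicts that share the same keys.

    Values only need to support `+` and division by an int, so plain floats
    work here and torch tensors would behave the same way.
    """
    if not state_dicts:
        raise ValueError("need at least one checkpoint to average")
    keys = state_dicts[0].keys()
    for sd in state_dicts[1:]:
        if sd.keys() != keys:
            raise KeyError("checkpoints have mismatched parameter names")
    count = len(state_dicts)
    averaged = {}
    for name in keys:
        total = state_dicts[0][name]
        for sd in state_dicts[1:]:
            total = total + sd[name]
        averaged[name] = total / count
    return averaged
```

With real checkpoints, each entry of state_dicts would be the model's parameter dictionary loaded from one of the files returned by latest_checkpoints.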
You can reset the optimizer and training scheduler (while keeping the weights) by passing --warm_start
```
python train.py --warm_start -c configs/waveflow-h16-r64-bipartize.json
```
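The difference between a normal resume and a warm start can be summarized as follows. This is an illustrative sketch only: the checkpoint is modeled as a hypothetical dict with "model", "optimizer" and "iteration" entries, and the actual checkpoint layout used by train.py may differ.

```python
def restore(checkpoint, warm_start=False):
    """Return the training state to continue from.

    `checkpoint` is a hypothetical dict with "model", "optimizer" and
    "iteration" entries; the real checkpoint layout may differ.
    """
    state = {"model": checkpoint["model"]}
    if warm_start:
        # Keep the weights only: fresh optimizer state, schedule restarts at 0.
        state["optimizer"] = None
        state["iteration"] = 0
    else:
        # Normal resume: continue with the saved optimizer state and step count.
        state["optimizer"] = checkpoint.get("optimizer")
        state["iteration"] = checkpoint.get("iteration", 0)
    return state
```

A warm start is typically useful when fine-tuning on new data or changing the learning-rate schedule, where the old optimizer statistics no longer apply.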
Synthesize waveform from the trained model.

Set checkpoint_path in the config file and pass --synthesize to train.py. The model generates waveforms by looping over test_files.txt.
```
python train.py --synthesize -c configs/waveflow-h16-r64-bipartize.json
```
If "fp16_run": true is set, the model uses FP16 (half-precision) arithmetic, which is faster on GPUs equipped with Tensor Cores.

Pretrained Weights

We provide pretrained weights via Google Drive. The models were trained for 5M steps, and the weights were then averaged over the last 20 checkpoints with -a 20. Audio quality almost matches that of the original paper.

Models	Download
waveflow-h16-r64-bipartize	Link
waveflow-h16-r128-bipartize	Link

Reference

NVIDIA Tacotron2: https://github.com/NVIDIA/tacotron2

NVIDIA WaveGlow: https://github.com/NVIDIA/waveglow

r9y9 wavenet-vocoder: https://github.com/r9y9/wavenet_vocoder

FloWaveNet: https://github.com/ksw0306/FloWaveNet

Parakeet: https://github.com/PaddlePaddle/Parakeet
