
VoiceSplit

Unofficial PyTorch implementation of VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking

Final project for SCC5830 - Image Processing @ ICMC/USP.

Dataset

We initially use the LibriSpeech dataset for this task. However, since LibriSpeech contains clean single-speaker utterances, we need to generate audio samples with overlapping voices from it.
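A minimal sketch of how such overlapped samples could be generated: mix a target utterance with an interfering utterance at a chosen signal-to-noise ratio. The function name, the SNR parameter, and the numpy-array representation of the waveforms are assumptions for illustration, not the repository's actual preprocessing code.

```python
import numpy as np

def mix_utterances(target, interferer, snr_db=0.0):
    """Mix a target utterance with an interferer at a given SNR (dB).

    Both inputs are 1-D float arrays of waveform samples; the interferer
    is tiled or truncated to match the target's length.
    """
    # Match lengths: repeat the interferer if it is shorter, then truncate.
    reps = int(np.ceil(len(target) / len(interferer)))
    interferer = np.tile(interferer, reps)[: len(target)]

    # Scale the interferer so the mixture has the requested target/interferer SNR.
    target_power = np.mean(target ** 2)
    interf_power = np.mean(interferer ** 2)
    scale = np.sqrt(target_power / (interf_power * 10 ** (snr_db / 10)))
    return target + scale * interferer
```

The clean target waveform can then be kept as the training label, with the mixture (plus a reference utterance of the target speaker) as the model input.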

Improvements

  • We use Si-SNR with PIT instead of the power-law compressed loss, because it achieves better results (comparison available at: https://github.com/Edresson/VoiceSplit).
  • We use the Mish activation function instead of ReLU, which improved the results.
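For reference, a sketch of Si-SNR with permutation-invariant training (PIT): Si-SNR projects the estimate onto the reference to remove scale differences, and PIT scores every speaker-to-estimate assignment and keeps the best one. This numpy version is for illustration only; the repository's actual loss is implemented in PyTorch.

```python
import numpy as np
from itertools import permutations

def si_snr(est, ref, eps=1e-8):
    """Scale-invariant SNR (dB) between an estimated and a reference signal."""
    est = est - est.mean()
    ref = ref - ref.mean()
    # Project the estimate onto the reference; the residual is treated as noise.
    s_target = (np.dot(est, ref) / (np.dot(ref, ref) + eps)) * ref
    e_noise = est - s_target
    return 10 * np.log10(np.dot(s_target, s_target) / (np.dot(e_noise, e_noise) + eps))

def pit_si_snr(estimates, references):
    """Permutation-invariant Si-SNR: try every speaker assignment, keep the best."""
    n = len(estimates)
    best = -np.inf
    for perm in permutations(range(n)):
        score = np.mean([si_snr(estimates[i], references[p]) for i, p in enumerate(perm)])
        best = max(best, score)
    return best  # a training loss would minimize the negative of this
```

Because the score is permutation-invariant, the model is not penalized for outputting the separated speakers in a different order than the labels.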
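The Mish activation mentioned above is defined as x * tanh(softplus(x)); a one-line numpy version (illustrative, not the repository's PyTorch implementation):

```python
import numpy as np

def mish(x):
    """Mish activation: x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x)."""
    return x * np.tanh(np.log1p(np.exp(x)))
```

Unlike ReLU, Mish is smooth everywhere and allows small negative outputs, which is often credited with improving gradient flow.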

Report

You can see a report of what was done in this repository here

Demos

Colab notebook demos:

Exp 1: link

Exp 2: link

Exp 3: link

Exp 4: link

Exp 5 (best): link

Site demo for the experiment with best results (Exp 5): https://edresson.github.io/VoiceSplit/

ToDos:

  • Create documentation for the repository and remove unused code

Future Works

  • Train VoiceSplit model with GE2E3k and Mean Squared Error loss function

Acknowledgment:

This repository contains code from other contributors; due credit is given in the functions used:

Preprocessing: Eren Gölge @erogol

VoiceFilter Model: Seungwon Park @seungwonpark