DEMUCS-for-Speech-Enhancement

Welcome to the DEMUCS-for-Speech-Enhancement repository.

DEMUCS is a source separation model proposed by Facebook (now Meta) that attracted wide attention for its fast processing speed and strong performance. It was later adapted to speech enhancement, where it also performed well [1]. This repository provides the following:

Implementation of HD-DEMUCS [2]
DEMUCS in the time-frequency domain
HD-DEMUCS in the time-frequency domain

Results are reported at the end of this README, where you can compare the performance of HD-DEMUCS and the time-frequency-domain variants.

Update

2023.11.06

Requirements

This repository was developed on Ubuntu 22.04 with PyTorch 2.0.1, Python 3.10, and CUDA 11.7. Install the package dependencies with:

pip install -r requirements.txt
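After installing, it can help to confirm that the environment matches the versions listed above. The following is a minimal sketch using only the standard library; the `EXPECTED` table and the `check_versions` helper are illustrative names, not part of this repository, and only the PyTorch version quoted in this README is checked.

```python
from importlib import metadata

# Version quoted in this README; the full package list lives in requirements.txt.
EXPECTED = {"torch": "2.0.1"}


def check_versions(expected):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Drop a local build suffix such as "+cu117" before comparing.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in check_versions(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed}")
```

An empty result means every listed package is installed at the expected version.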

Dataset Installation

The first step is to set up the dataset used to train and evaluate the model. This project uses a combination of the Voice Bank corpus and the DEMAND database.

Voice Bank + DEMAND Dataset: The dataset combines clean speech from the Voice Bank corpus and various types of noise from the DEMAND database to simulate realistic noisy speech conditions.

Download: https://datashare.ed.ac.uk/handle/10283/1942
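The configuration in this README expects separate training and validation directories. One possible way to create a validation set, sketched below, is to hold out all utterances from a few speakers. This is an assumption for illustration, not necessarily the split used for the reported results: the speaker IDs in `VALID_SPEAKERS` are hypothetical choices, `split_by_speaker` is not a function in this repository, and the sketch assumes file names of the form `p226_001.wav` (speaker ID, underscore, utterance number).

```python
from pathlib import Path

# Hypothetical choice: speakers held out for validation.
VALID_SPEAKERS = {"p226", "p287"}


def split_by_speaker(wav_names, valid_speakers=VALID_SPEAKERS):
    """Partition file names such as 'p226_001.wav' into (train, valid) lists."""
    train, valid = [], []
    for name in sorted(wav_names):
        # The speaker ID is the part of the file name before the first underscore.
        speaker = Path(name).name.split("_")[0]
        (valid if speaker in valid_speakers else train).append(name)
    return train, valid
```

Splitting by speaker rather than by random utterance keeps validation speakers unseen during training, which gives a more honest estimate of generalization.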

Getting Started

Install the necessary libraries.
Set the directory paths for your dataset in options.py:

# dataset path
noisy_dirs_for_train = '../Dataset/train/noisy/'   
noisy_dirs_for_valid = '../Dataset/valid/noisy/'

Run train_interface.py to start training.
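Before launching a long training run, you may want to verify that the configured directories exist and contain audio. The sketch below is a standalone check, not part of this repository; `count_wavs` is a hypothetical helper, and the two paths are copied from the options.py excerpt above.

```python
from pathlib import Path


def count_wavs(directory):
    """Return the number of .wav files directly inside `directory`, or None if it is missing."""
    path = Path(directory)
    if not path.is_dir():
        return None
    return sum(1 for f in path.iterdir() if f.suffix.lower() == ".wav")


if __name__ == "__main__":
    # Same values as in the options.py excerpt.
    for d in ("../Dataset/train/noisy/", "../Dataset/valid/noisy/"):
        n = count_wavs(d)
        print(f"{d}: {'missing' if n is None else f'{n} wav files'}")
```

A directory reported as missing or containing 0 wav files usually means the relative path does not match the location from which the script is run.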

Architecture

Results

References

[1] Défossez, Alexandre, Gabriel Synnaeve, and Yossi Adi. "Real time speech enhancement in the waveform domain." arXiv preprint arXiv:2006.12847 (2020).
[2] Kim, Doyeon, et al. "HD-DEMUCS: General Speech Restoration with Heterogeneous Decoders." arXiv preprint arXiv:2306.01411 (2023).

Contact

E-mail: jbcha7@yonsei.ac.kr
