N-HANS

N-HANS is a Python toolkit for in-the-wild speech enhancement, including speech, music, and general audio denoising, separation, and selective noise or source suppression. The functionality is realised by two neural network models that share the same architecture but are trained separately. Each model consists of stacks of residual blocks conditioned on additional speech or environmental noise recordings, so that it can adapt to unseen speakers or environments in real life.

    pip3 install N-HANS
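The conditioning mechanism described above can be pictured with a minimal, hypothetical sketch: a residual block whose activations are shifted by an embedding of the reference (speech or noise) recording. The layer sizes, the embedding dimension, and the additive conditioning scheme below are illustrative assumptions, not the shipped N-HANS model.

```python
# Illustrative sketch (TF 2.x) of a residual block conditioned on a reference
# embedding. Shapes and the additive conditioning are assumptions, not N-HANS code.
import tensorflow as tf

def conditioned_residual_block(x, ref_embedding, filters=64):
    # Project the reference embedding and broadcast it over time as a bias.
    cond = tf.keras.layers.Dense(filters)(ref_embedding)
    cond = tf.keras.layers.Reshape((1, filters))(cond)

    h = tf.keras.layers.Conv1D(filters, 3, padding='same', activation='relu')(x + cond)
    h = tf.keras.layers.Conv1D(filters, 3, padding='same')(h)
    return tf.keras.layers.Activation('relu')(x + h)   # residual connection

# Example wiring: noisy spectrogram frames plus a reference-recording embedding.
noisy = tf.keras.Input(shape=(None, 64))
ref = tf.keras.Input(shape=(128,))
out = conditioned_residual_block(noisy, ref, filters=64)
model = tf.keras.Model([noisy, ref], out)
```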

Please direct any questions or requests to Shuo Liu (shuo.liu@informatik.uni-augsburg.de).

Citation

If you use N-HANS or any of its code in your research work, you are kindly asked to acknowledge the use of N-HANS in your publications.

https://link.springer.com/article/10.1007%2Fs11042-021-11080-y

S. Liu, G. Keren, E. Parada-Cabaleiro, B. Schuller, "N-HANS: A neural network-based toolkit for in-the-wild audio enhancement," Multimedia Tools and Applications, 2021, accepted, 27 pages.

Prerequisites

  • Python 3 / Python 2.7

Python Dependencies

  • numpy >=1.14.5
  • scipy >=1.0.1
  • tensorflow/tensorflow-gpu >=1.14.0 or tensorflow >= 2.0

Usage

Loading Models

After pip3 install N-HANS, users are expected to create an N-HANS folder for conducting audio denoising or separation tasks. For Linux users, the commands load_denoiser or load_separator will download the pretrained denoising and separation models, accompanied by some audio examples. The trained models and audio examples can also be found in the above N_HANS_Selective_Noise and N_HANS_Source_Separation folders, which gives users working on other operating systems the opportunity to apply N-HANS.

Applying N-HANS

N-HANS has been developed to process standard .wav audio with a sample rate of 16 kHz, coded as 16-bit signed integer PCM. Using the embedded format converter built on the sox package, audio files in other formats are automatically converted to this standard setting.
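As an illustration of that conversion step (not the converter shipped with N-HANS), a file can be resampled and re-encoded with the external sox tool, which must be installed separately; the helper name and paths below are hypothetical.

```python
# Hypothetical helper: convert an audio file to 16 kHz, 16-bit signed-integer PCM
# .wav using the external sox command-line tool (requires sox to be installed).
import subprocess

def to_nhans_format(src_path, dst_path):
    subprocess.run(
        ['sox', src_path, '-r', '16000', '-b', '16', '-e', 'signed-integer', dst_path],
        check=True)

# e.g. to_nhans_format('recording.mp3', 'recording_16k.wav')
```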

Commands

| Task | Command | Description |
|------|---------|-------------|
| speech denoising | `nhans_denoiser --input noisy.wav --output denoised.wav --neg noise.wav` | `--neg` indicates the environmental noise |
| selective noise suppression | `nhans_denoiser --input noisy.wav --output denoised.wav --pos preserve.wav --neg suppress.wav` | `--pos` indicates the noise to be preserved<br>`--neg` hints the noise to be suppressed |
| speech separation | `nhans_separator --input mixed.wav --output separated.wav --pos target.wav --neg interference.wav` | `--pos` indicates the target speaker<br>`--neg` hints the interference speaker |

Examples

Processing a single wav sample

| Task | Example |
|------|---------|
| speech denoising | `nhans_denoiser --input audio_examples/exp2_noisy.wav --output denoised.wav --neg audio_examples/exp2_noise.wav` |
| selective noise suppression | `nhans_denoiser --input audio_examples/exp1_noisy.wav --output denoised.wav --pos audio_examples/exp1_posnoise.wav --neg audio_examples/exp2_negnoise.wav` |
| speech separation | `nhans_separator --input audio_examples/mixed.wav --output separated.wav --pos audio_examples/target_speaker.wav --neg audio_examples/noise_speaker.wav` |

Processing multiple wav samples in folders

Please create folders containing the noisy recordings and the (positive and) negative reference recordings; the recordings belonging to the same example should have identical filenames across folders (see the sketch after the table below).

| Task | Example |
|------|---------|
| speech denoising | `nhans_denoiser --input audio_examples/noisy_dir --output denoised_dir --neg audio_examples/neg_dir` |
| selective noise suppression | `nhans_denoiser --input audio_examples/noisy_dir --output denoised_dir --pos audio_examples/pos_dir --neg=audio_examples/neg_dir` |
| speech separation | `nhans_separator --input audio_examples/mixed_dir --output separated_dir --pos=audio_examples/target_dir --neg=audio_examples/interference_dir` |
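A quick way to verify the filename convention before batch processing is a small check like the one below; the folder names are simply the example directories from the table above, and the helper itself is not part of N-HANS.

```python
# Hypothetical sanity check: every noisy file should have a same-named
# counterpart in the negative (and, if used, positive) reference folder.
import os

noisy = set(os.listdir('audio_examples/noisy_dir'))
neg = set(os.listdir('audio_examples/neg_dir'))

unmatched = noisy.symmetric_difference(neg)
if unmatched:
    print('Files without a counterpart:', sorted(unmatched))
```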

Train your own N-HANS

You can train your own selective audio suppression and separation systems with the N-HANS architecture based on this repository.

  1. To train a selective audio suppression system, go into N-HANS/N_HANS___Selective_Noise/ and create lists of clean speech samples and environmental noises. Feed the paths of the folders containing speech .wav files and noise .wav files to create_seeds, which generates two pickle files (.pkl) listing the speech and noise .wav files separately (a sketch of this step appears after this list). To train a system that is maximally consistent with our trained model, we provide the seed lists for the data split of the AudioSet Corpus (https://research.google.com/audioset/) used in our publication: download AudioSet_seeds.

    To train a speech separation system, go into N-HANS/N_HANS___Speech_Separation/ and create a speech list by pointing create_seeds to your folder containing speech .wav files, which will produce a .pkl file.

  2. Run the main.py script with your specifications indicated by the FLAGS listed in the following table (the default specifications were used to obtain our trained models). The reader.py script provides the training, validation, and test data pipeline and feeds the data to the N-HANS neural networks constructed in main.py.

| FLAGS | Default | Functionality |
|-------|---------|---------------|
| --speech_wav_dir | './speech_wav_dir/' | directory containing all speech .wav files |
| --noise_wav_dir | './noise_wav_dir/' | directory containing all noise .wav files |
| --wav_dump_folder | './wav_dump/' | directory to save denoised signals |
| --eval_seeds | 'valid' | evaluation is applied to the 'valid' set; for testing, change it to 'test' |
| --window_frames | 35 | number of frames of the input noisy signal |
| --context_frames | 200 | number of frames of the reference context signal |
| --random_slices | 50 | number of random samples drawn from each pair of clean speech and noise signals |
| --model_name | 'nhans' | model name |
| --restore_path | '' | path from which to restore a trained model |
| --alg | 'sgd' | optimiser used to train N-HANS |
| --train_mb | 64 | mini-batch size for training data |
| --eval_mb | 64 | mini-batch size for validation or test data |
| --lr | 0.1 | learning rate |
| --mom | 0.0 | momentum for the optimiser |
| --bn_decay | 0.95 | batch normalisation decay |
| --eval_before_training | False | training phase: False; test phase: True |
| --eval_after_training | True | training phase: True; test phase: False |
| --train_monitor_every | 1000 | show training information every "train_monitor_every" batches |
| --eval_every | 5000 | show evaluation information every "eval_every" training batches |
| --checkpoint_dir | './checkpoints' | directory to save checkpoints |
| --summaries | './summaries' | directory for summaries |
| --dump_results | './dump' | directory for intermediate model output during training |
  3. To test your model, set --restore_path to the trained model and set --eval_seeds=test.
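The seed-list step referenced in item 1 can be pictured with the following sketch; the real create_seeds script may differ, so the function below, its output format, and the folder paths are assumptions for illustration only.

```python
# Hypothetical sketch of building a seed list: collect all .wav paths in a folder
# and store them in a pickle file, analogous to what create_seeds produces.
import glob
import os
import pickle

def make_seed_list(wav_dir, pkl_path):
    wav_files = sorted(glob.glob(os.path.join(wav_dir, '*.wav')))
    with open(pkl_path, 'wb') as f:
        pickle.dump(wav_files, f)

make_seed_list('./speech_wav_dir/', 'speech_seeds.pkl')
make_seed_list('./noise_wav_dir/', 'noise_seeds.pkl')
```

A training run using the flags from the table above might then be started along the lines of `python main.py --speech_wav_dir ./speech_wav_dir/ --noise_wav_dir ./noise_wav_dir/ --lr 0.1 --train_mb 64`, with the remaining flags left at their defaults; the exact combination depends on your data and hardware.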

