
Additional questions #49

Open
Minseung-Kim opened this issue Feb 4, 2022 · 7 comments

Comments

@Minseung-Kim

Hello again,

I am trying to reproduce the DeepXi framework in PyTorch (TensorFlow is not so familiar to me.. lol) and have some questions.

  1. The DEMAND voicebank (Valentini) dataset provides a training set in the form of (noisy, clean) pairs for each utterance.

When we subtract the clean signal from the noisy signal, we get the corresponding noise signal.

For the DEMAND voicebank dataset, did you use only those provided pairs, or an additional clean or noise dataset?

In my previous question, you said that the noise recording used to corrupt the clean speech is randomly selected (this implies the noise recording should be longer than the clean speech).

If so, could you tell me what kind of additional noise recordings you used? And have you used additional clean speech beyond what is provided in the DEMAND voicebank dataset?

  2. In the training step, DeepXi uses both a training set and a validation set.

As far as I know, the validation set is often used for early stopping. Is the validation set in the DeepXi framework also used for this purpose?

Could you explain how the validation set was used?

Thank you!

@anicolson
Owner

Hi Minseung-Kim,

For 1):

That is indeed how we get the noise for DEMAND-VB, and we only use the noise from DEMAND-VB (no external noise set is used).

Any one of those noise samples can then be used to corrupt a clean speech recording, i.e., we no longer treat them as (clean speech, noise) pairs; we treat them as two independent sets (this applies to the training set only).
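As a quick illustration of recovering the noise from a pair (my own sketch, not DeepXi code; it assumes the noisy and clean signals are time-aligned arrays of equal length):

```python
import numpy as np

def extract_noise(noisy, clean):
    """Recover the noise component from a (noisy, clean) pair by
    subtraction. Both signals must be time-aligned and equal length."""
    noisy = np.asarray(noisy, dtype=np.float64)
    clean = np.asarray(clean, dtype=np.float64)
    if noisy.shape != clean.shape:
        raise ValueError("noisy and clean must have the same length")
    return noisy - clean
```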

If you look at `def add_noise(self, s, d, s_len, d_len, snr):`,

A noise sample is randomly selected to corrupt a clean speech recording (only if its length is equal to or greater than the clean speech recording). If a noise sample does not meet this condition, another noise sample is selected. This continues until a noise sample is randomly selected that meets the condition.
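That selection loop can be sketched like this (an illustrative re-implementation, not the actual `add_noise` code; the function name and the random segment cropping are my own assumptions):

```python
import random

def pick_noise_sample(noise_set, s_len):
    """Randomly select a noise recording whose length is equal to or
    greater than the clean speech length s_len, re-drawing until the
    condition is met, then crop a random segment of length s_len."""
    while True:
        d = random.choice(noise_set)
        if len(d) >= s_len:
            start = random.randint(0, len(d) - s_len)
            return d[start:start + s_len]
```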

For 2):

If I were re-implementing this framework now, I would certainly use early stopping. But, back in 2019, a maximum number of epochs was specified, and the epoch that attained the highest validation scores was selected as the epoch to be tested.
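In other words, model selection here amounts to the following (a sketch of the idea, not the repository's actual code):

```python
def select_best_epoch(val_scores):
    """Given {epoch: validation_score} gathered over a fixed number of
    training epochs, return the epoch with the highest score, i.e. the
    checkpoint chosen for testing."""
    return max(val_scores, key=val_scores.get)
```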

I hope this helps, please let me know if something I said is not clear.

@anicolson
Owner

On a side note, I am also using PyTorch and PyTorch Lightning now, let me know if you are interested in helping to update this repository to something PyTorch based :)

@Minseung-Kim
Author

Thank you for the reply! Now I understand.
DEMAND-VB has two versions of the training set, 28spk and 56spk.
Which version did you use?

@anicolson
Owner

We have only used the 28 speaker version.

@Minseung-Kim
Author

Oh, thank you for the response.
In the 28-speaker case, since the DEMAND-VB dataset doesn't provide a separate validation set, is it OK to use the utterances of 2 of the 28 speakers (e.g., p.286 and p.287) as a validation set, with the remaining 26 speakers as the training set?

Or, in your experience, is there a different way to set up a validation set?

@anicolson
Owner

Hi Minseung-Kim,

Using two of the speakers for the validation set has been the standard way. I have not personally seen it done another way :)
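A minimal sketch of such a speaker-held-out split (my own illustration; it assumes the utterances are represented as (speaker_id, path) pairs):

```python
def split_by_speaker(utterances, val_speakers):
    """Split a list of (speaker_id, path) pairs into training and
    validation sets by holding out the given speakers."""
    train = [u for u in utterances if u[0] not in val_speakers]
    val = [u for u in utterances if u[0] in val_speakers]
    return train, val
```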

@zuowanbushiwo

@anicolson
Where can I download the DEMAND-VB dataset (like deep_xi_dataset.zip)? Is there a script for preparing the DEMAND-VB data?
Thanks!
