How to create the dataset #11

pankaj2701 · 2018-05-21T03:21:19Z

I am wondering how to create the training dataset. I understand the format but don't know how to create one. Do we have to annotate the training data manually? Manual annotation would be difficult. Is there a utility that can create approximate training labels, which can then be refined manually?

pankaj2701 · 2018-05-21T03:24:58Z

Manually annotating data in which music, noise, and speech are mixed would be difficult.
Can we provide the noise, speech, and music as separate recordings? Labelling would be easier in that case.

jtkim-kaist · 2018-05-22T12:08:23Z

The recorded dataset we provide was manually annotated by two professional engineers.

If you want to construct a training set, do the following:

Run the VAD on the clean speech to get the labels.
Add noise to that clean speech using the FaNT tool or VOICEBOX (implemented in MATLAB).
Adding noise does not change the timing, so the labels from the clean speech also apply to the noisy speech.

If you don't have clean speech, manual annotation cannot be avoided.
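
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure used for the provided dataset: a plain energy threshold stands in for the VAD, a NumPy mixing function stands in for FaNT or VOICEBOX, and the names `energy_vad` and `add_noise`, the 25 ms / 10 ms framing at 16 kHz, and the -40 dB threshold are all hypothetical choices. The SNR here is computed over the whole signal, which may differ from how FaNT measures speech level.

```python
import numpy as np


def energy_vad(clean, frame_len=400, hop=160, threshold_db=-40.0):
    """Label each frame of clean speech as speech (1) or non-speech (0).

    Assumption: a frame counts as speech when its energy is within
    `threshold_db` of the loudest frame. A real VAD can replace this.
    """
    n_frames = 1 + max(0, (len(clean) - frame_len) // hop)
    energy = np.empty(n_frames)
    for i in range(n_frames):
        frame = clean[i * hop : i * hop + frame_len]
        energy[i] = np.mean(frame ** 2)
    energy_db = 10.0 * np.log10(energy + 1e-12)
    return (energy_db > energy_db.max() + threshold_db).astype(np.int8)


def add_noise(clean, noise, snr_db):
    """Mix noise into clean speech at the requested whole-signal SNR (dB)."""
    # Repeat the noise if it is shorter than the speech, then trim to length.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + scale * noise


# Synthetic stand-in for a clean recording: silence, a tone, silence (16 kHz).
sr = 16000
t = np.arange(sr) / sr
clean = np.concatenate([np.zeros(sr), 0.5 * np.sin(2 * np.pi * 220 * t), np.zeros(sr)])
noise = np.random.default_rng(0).normal(size=sr // 2)

labels = energy_vad(clean)                    # step 1: labels from clean speech
noisy = add_noise(clean, noise, snr_db=10.0)  # step 2: add noise
# Step 3: `noisy` has the same length and timing as `clean`,
# so `labels` can be used unchanged as the labels for `noisy`.
```

The same clean recording can be mixed with several noise types and SNR values, reusing one set of labels each time. The frame length and hop would need to match whatever framing the training code expects.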
