
How to use dvector_create.py #34

Open
zeyuanchen23 opened this issue May 13, 2019 · 7 comments

Comments

@zeyuanchen23

Hi!

Could you please explain how to run dvector_create.py on the TIMIT dataset?

This program tries to load some .wav files (line 91). However, the original data in TIMIT are .WAV files, and after preprocessing they are converted to .npy files. Where can the .wav files be found?

Thanks!

@wrongbattery

Actually, each TIMIT .WAV file is a NIST SPHERE (.sph) file; you need to write code to convert it to a standard .wav file. But the main problem is that the audio preprocessing for training the embedding differs from the preprocessing used to create d-vectors (dvector_create.py) for speaker diarization. Also, the TIMIT dataset is not meaningful for the speaker diarization problem itself, since every file contains only one speaker.

@zeyuanchen23
Author

zeyuanchen23 commented May 16, 2019

> Actually, each TIMIT .WAV file is a NIST SPHERE (.sph) file; you need to write code to convert it to a standard .wav file. But the main problem is that the audio preprocessing for training the embedding differs from the preprocessing used to create d-vectors (dvector_create.py) for speaker diarization. Also, the TIMIT dataset is not meaningful for the speaker diarization problem itself, since every file contains only one speaker.

Thanks for your reply, @wrongbattery. I see that the preprocessing for training and for d-vector creation are different.
About TIMIT, the website says TIMIT contains broadband recordings of 630 speakers of eight major dialects of American English, each reading ten phonetically rich sentences. Does that mean a model trained on TIMIT can be used for the diarization task on other datasets (e.g. AMI)?

@wrongbattery

The diarization task needs a good embedding to perform well; the UIS-RNN authors said it takes at least 5k speakers to get good embeddings. I already used TIMIT for training and ran the diarization task on the AMI dataset, and the results are very poor compared to the pyannote repo. It seems both the UIS-RNN repo and this repo are incomplete, so you need to implement a lot of functionality to train on a new dataset.

@nidhal1231

@wrongbattery d-vectors with dimension [N, 256], with N the number of sliding windows, should be the input of uis-rnn (train_sequence), but the problem is that the cluster ID for each d-vector has to be extracted from the dataset's labels.
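One way to derive per-window cluster IDs from ground-truth annotations, assuming each sliding window can be labeled by the speaker active at its center time. The function name and the `(start, end, speaker)` segment format are illustrative, not from the repo:

```python
import numpy as np

def windows_to_cluster_ids(num_windows, hop_s, win_s, segments):
    """Assign one speaker label per sliding window.

    num_windows: N, the number of sliding windows in the utterance.
    hop_s, win_s: window hop and window length in seconds.
    segments: ground-truth annotations as (start_s, end_s, speaker) tuples.
    A window is labeled by whichever segment covers its center time.
    """
    ids = []
    for i in range(num_windows):
        center = i * hop_s + win_s / 2
        label = next((spk for s, e, spk in segments if s <= center < e),
                     "unknown")  # window center falls outside all segments
        ids.append(label)
    return np.asarray(ids)
```

The resulting length-N array lines up with the [N, 256] d-vector matrix, giving the (train_sequence, train_cluster_id) pair per utterance.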

@pravn

pravn commented Aug 20, 2019

I have a workaround for the wav creator for TIMIT, which I have put up in my forked repo.
https://github.com/pravn/PyTorch_Speaker_Verification/blob/master/VAD_segments.py

In VAD_segments.py, for TIMIT, wave.open can complain that the RIFF headers aren't right, so we rewrite the file to 'tmp.wav' and work from there.

'''
import wave
import librosa
import soundfile as sf

try:
    file = path
    wave.open(path, 'rb')
except wave.Error:
    # TIMIT's SPHERE-formatted files make wave.open() fail on the RIFF header,
    # so decode with librosa and rewrite as a standard RIFF wav instead.
    tmp, _ = librosa.load(path, sr=sr)
    sf.write('tmp.wav', tmp, sr)
    file = 'tmp.wav'
'''

@chrisspen

Has anyone figured this out? @pravn's change allowed me to generate the .npy files for the TIMIT dataset. But if I put them into the corresponding training and testing npz files and run them with the demo.py file in the uis-rnn repo, training works fine, but testing fails, saying the testing data is in the wrong format.

It looks like dvector_create.py outputs a 1-d array of floats for each testing data point but uis-rnn expects a 2d array. Am I missing something?
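As a possible workaround, assuming the flat arrays are just (N, 256) matrices saved without their shape, each test sequence could be reshaped back to 2-D before being passed to uis-rnn. The helper name `as_test_sequence` is mine:

```python
import numpy as np

def as_test_sequence(arr, dim=256):
    """Coerce a saved d-vector sequence into the (num_windows, dim) shape
    that uis-rnn's predict() expects for each test sequence."""
    arr = np.asarray(arr, dtype=float)
    if arr.ndim == 1:
        # A flat dump of N windows: recover (N, dim). Fails loudly if the
        # length is not a multiple of dim, i.e. the data is something else.
        arr = arr.reshape(-1, dim)
    assert arr.ndim == 2 and arr.shape[1] == dim
    return arr
```

This only papers over the shape mismatch; if dvector_create.py genuinely emits a single 256-dim vector per utterance (N = 1), the windowing step itself would need fixing instead.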

@008karan

008karan commented Feb 1, 2020

@wrongbattery
Have you fully trained the speaker diarization model using UIS-RNN? If yes, please enlighten us, as there is a lot of confusion going on.

Can you describe which pieces are missing, what the dataset requirements are, and anything else that would be helpful for others?

Cheers!
