Reproducing paper results #39

sathibault · 2022-09-18T12:35:54Z

I'm unable to train a working monolingual embedding model. Using the provided script (train_monolingual_embedding.py) with the top 165 English words yields the following results at the end of training:
loss: 0.7145 - accuracy: 0.7711 - val_loss: 7.6774 - val_accuracy: 0.0586

Based on the paper, I was expecting validation accuracy somewhere in the 70% range. Does the result depend on choosing the "right" words?

Could you please post a tutorial or maybe some of the missing files (e.g. train_files.txt, val_files.txt, test_files.txt, commands.txt) for reproducing the embedding?

I also notice that the file references seem to point to Common Voice rather than MSW. I'm using the English clips downloaded from MSW, which I'm assuming are the same. I've converted these to 16 kHz, 16-bit WAV files using pydub, which I guess uses ffmpeg under the hood.
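
To rule out the conversion step as a cause, the converted clips can be sanity-checked against the format described above. This is only a minimal sketch using Python's standard `wave` module. The helper names `describe_wav` and `matches_expected` are hypothetical and not part of the repository, and the 16 kHz / 16-bit targets are simply the values I converted to, not requirements confirmed by the training script.

```python
import wave


def describe_wav(path):
    """Read basic format fields from a WAV file header."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        return {
            "sample_rate": sample_rate,
            "sample_width_bytes": wav.getsampwidth(),  # 2 bytes == 16-bit
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / sample_rate,
        }


def matches_expected(path, sample_rate=16000, sample_width_bytes=2):
    """Return True if the file has the assumed sample rate and bit depth.

    The defaults (16 kHz, 16-bit) are assumptions taken from this post.
    """
    info = describe_wav(path)
    return (
        info["sample_rate"] == sample_rate
        and info["sample_width_bytes"] == sample_width_bytes
    )
```

Running `matches_expected` over every converted clip and listing the files that return `False` would show whether any clips slipped through with a different sample rate or bit depth.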
