How to check speaker disentanglement during training? #50

Open
lambda-delta34 opened this issue Aug 3, 2021 · 1 comment

Comments

@lambda-delta34

What I have done: during testing I deliberately set a near-zero speaker embedding vector, for both the image representation and the loss measurement (MSE; I assume higher is better here, since it indicates the output depends on the speaker embedding).

As for the result, I can clearly observe a significant MSE (around 33) after a few days of training. However, when doing an actual voice conversion (from one speaker to another), the model only achieves reconstruction, without converting the voice.
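The zero-embedding check described above can be sketched as follows. This is a minimal, self-contained illustration, not the repository's code: `zero_embedding_mse` and the toy model are hypothetical stand-ins for whatever conversion model is being trained, and a real check would run on spectrograms from the validation set.

```python
import numpy as np

def zero_embedding_mse(model, spectrogram, speaker_embedding):
    """Compare the normal reconstruction against the output produced
    with a zeroed speaker embedding. A large MSE gap suggests the
    decoder actually uses the speaker embedding (good disentanglement);
    a near-zero gap suggests the embedding is being ignored."""
    recon = model(spectrogram, speaker_embedding)
    recon_zero = model(spectrogram, np.zeros_like(speaker_embedding))
    return float(np.mean((recon - recon_zero) ** 2))

# Toy stand-in model: the output shifts by the embedding mean,
# so zeroing the embedding changes the output by exactly that mean.
toy_model = lambda spec, emb: spec + emb.mean()

spec = np.ones((80, 128))        # fake mel-spectrogram (bins x frames)
emb = np.full(256, 2.0)          # fake speaker embedding
mse = zero_embedding_mse(toy_model, spec, emb)
```

With this toy model the gap is (2.0)² = 4.0; a model that ignores the embedding would give a gap of 0, which is the failure mode the original question is probing for.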

If possible, I would really appreciate knowing whether there are other ways to test voice conversion during training.

Many thanks.

@auspicious3000
Owner

Sorry, I could not understand your question. For example, what is "image representation"?
