How to check speaker disentanglement during training? #50

Open
lambda-delta34 opened this issue Aug 3, 2021 · 1 comment

Comments

@lambda-delta34

What I have done: during testing I deliberately set a near-zero speaker embedding vector, for both the image representation and the loss measurement (MSE; I assume higher is better here, since it indicates the output depends on the speaker embedding).

As for the result, I can clearly observe a significant MSE (around 33) after a few days of training. However, when doing an actual voice conversion (from one speaker to another), the model only achieves reconstruction, without converting the voice.
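The zero-embedding check described above can be sketched as follows. This is a minimal, self-contained illustration, not the repository's code: `zero_embedding_mse` and the toy model are hypothetical stand-ins for whatever conversion model is being trained, and a real check would run on spectrograms from the validation set.

```python
import numpy as np

def zero_embedding_mse(model, spectrogram, speaker_embedding):
    """Compare the normal reconstruction against the output produced
    with a zeroed speaker embedding. A large MSE gap suggests the
    decoder actually uses the speaker embedding (good disentanglement);
    a near-zero gap suggests the embedding is being ignored."""
    recon = model(spectrogram, speaker_embedding)
    recon_zero = model(spectrogram, np.zeros_like(speaker_embedding))
    return float(np.mean((recon - recon_zero) ** 2))

# Toy stand-in model: the output shifts by the embedding mean,
# so zeroing the embedding changes the output by exactly that mean.
toy_model = lambda spec, emb: spec + emb.mean()

spec = np.ones((80, 128))        # fake mel-spectrogram (bins x frames)
emb = np.full(256, 2.0)          # fake speaker embedding
mse = zero_embedding_mse(toy_model, spec, emb)
```

With this toy model the gap is (2.0)² = 4.0; a model that ignores the embedding would give a gap of 0, which is the failure mode the original question is probing for.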

If possible, I would really appreciate knowing whether there are other ways to test voice conversion during training.

Many thanks.

@auspicious3000
Owner

Sorry, I could not understand your question. For example, what is "image representation"?
