About additional loss #3

Open
p0p4k opened this issue Nov 7, 2023 · 3 comments

Comments

@p0p4k

p0p4k commented Nov 7, 2023

Hello, nice work. I have a question.
Q) How about adding an extra loss at the end of generation that matches the spk_enc embeddings of the reference wav and the generated wav? I ask because I do not see Meta-StyleSpeech's discriminator being used here (am I missing it somewhere?).

....
s_ref = self.spk_enc(y.transpose(1, 2), (y_mask == 0).squeeze(1))
....
# freeze spk_enc
s_out = self.spk_enc(y_out.transpose(1, 2), (y_out_mask == 0).squeeze(1))
# then take the cosine distance between s_ref and s_out as the extra loss
# unfreeze spk_enc

Thanks.
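
The idea above can be sketched roughly as follows, assuming PyTorch. This is only an illustration, not code from this repo: `ToySpeakerEncoder` is a hypothetical stand-in for `spk_enc`, `speaker_consistency_loss` is a made-up name, and the tensor shapes (mel-spectrograms as `(batch, n_mels, time)`, masks as `(batch, 1, time)` with 1 marking valid frames) are assumptions inferred from the snippet. Treating `s_ref` as a fixed target under `torch.no_grad()` is also an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToySpeakerEncoder(nn.Module):
    """Hypothetical stand-in for spk_enc: masked mean over time, then a linear layer."""

    def __init__(self, n_mels: int = 80, emb_dim: int = 16):
        super().__init__()
        self.proj = nn.Linear(n_mels, emb_dim)

    def forward(self, mel: torch.Tensor, pad_mask: torch.Tensor) -> torch.Tensor:
        # mel: (batch, time, n_mels); pad_mask: (batch, time), True where padded.
        keep = (~pad_mask).unsqueeze(-1).to(mel.dtype)
        pooled = (mel * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)
        return self.proj(pooled)


def speaker_consistency_loss(spk_enc, y, y_mask, y_out, y_out_mask):
    """Mean cosine distance between reference and generated speaker embeddings."""
    # Assumption: the reference embedding is a fixed target, so no gradient is tracked.
    with torch.no_grad():
        s_ref = spk_enc(y.transpose(1, 2), (y_mask == 0).squeeze(1))

    # Freeze the encoder parameters; gradients can still flow back into y_out.
    prev_flags = [p.requires_grad for p in spk_enc.parameters()]
    for p in spk_enc.parameters():
        p.requires_grad_(False)
    try:
        s_out = spk_enc(y_out.transpose(1, 2), (y_out_mask == 0).squeeze(1))
    finally:
        # Unfreeze by restoring the earlier flags. The graph built above
        # already excludes the encoder parameters.
        for p, flag in zip(spk_enc.parameters(), prev_flags):
            p.requires_grad_(flag)

    # Cosine distance = 1 - cosine similarity, averaged over the batch.
    return (1.0 - F.cosine_similarity(s_out, s_ref, dim=-1)).mean()


# Toy usage with random tensors standing in for real and generated mels.
enc = ToySpeakerEncoder()
y = torch.randn(2, 80, 50)
y_out = torch.randn(2, 80, 50, requires_grad=True)  # pretend generator output
mask = torch.ones(2, 1, 50)
loss = speaker_consistency_loss(enc, y, mask, y_out, mask)
loss.backward()  # fills y_out.grad; the encoder parameters receive no gradient
```

In a real training loop this term would presumably be added to the generator loss with some weight, which would need tuning.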

@hcy71o
Owner

hcy71o commented Nov 8, 2023

This repo is built on pure VITS and StyleSpeech in order to verify the SC-CNN technique. Do you mean the SCL (speaker consistency loss) in YourTTS?

@p0p4k
Author

p0p4k commented Nov 10, 2023

Is there any StyleSpeech loss in SC-CNN?

@hcy71o
Owner

hcy71o commented Dec 29, 2023

Loss terms related to meta-learning are excluded from this repo.
