About additional loss #3

p0p4k · 2023-11-07T09:01:49Z

Hello, nice work. I have a question.
Q) How about adding an extra loss at the end of generation that matches the spk_enc outputs for the reference wav and the generated wav? I ask because I do not see Meta-StyleSpeech's discriminator being used here (am I missing it somewhere?).

...
s_ref = self.spk_enc(y.transpose(1, 2), (y_mask == 0).squeeze(1))
...
# freeze spk_enc
s_out = self.spk_enc(y_out.transpose(1, 2), (y_out_mask == 0).squeeze(1))
# loss = cosine distance between s_ref and s_out
# unfreeze spk_enc

Thanks.
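
For readers who want to try the idea, the snippet above could be fleshed out roughly as follows. This is only a sketch under stated assumptions, not code from this repo: `speaker_consistency_loss` and `ToySpkEnc` are hypothetical names, the tensor shapes are guessed from the `transpose` and `squeeze` calls in the snippet, and the real `spk_enc` is assumed to return one embedding per utterance.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToySpkEnc(nn.Module):
    """Hypothetical stand-in for the repo's speaker encoder (masked mean + linear)."""

    def __init__(self, in_dim=80, emb_dim=16):
        super().__init__()
        self.proj = nn.Linear(in_dim, emb_dim)

    def forward(self, x, pad_mask):
        # x: [B, T, C]; pad_mask: [B, T], True where the frame is padding.
        keep = (~pad_mask).unsqueeze(-1).float()
        pooled = (x * keep).sum(1) / keep.sum(1).clamp(min=1.0)
        return self.proj(pooled)


def speaker_consistency_loss(spk_enc, y, y_mask, y_out, y_out_mask):
    """Mean cosine distance between reference and generated speaker embeddings.

    Assumed shapes: y and y_out are [B, C, T]; masks are [B, 1, T] with 1 for
    valid frames; spk_enc(x, pad_mask) returns [B, D].
    """
    # Freeze the encoder weights so this loss cannot update them.
    prev = [p.requires_grad for p in spk_enc.parameters()]
    for p in spk_enc.parameters():
        p.requires_grad_(False)
    try:
        # The reference embedding is a fixed target, so no graph is needed.
        with torch.no_grad():
            s_ref = spk_enc(y.transpose(1, 2), (y_mask == 0).squeeze(1))
        # No no_grad here: gradients must still flow back into y_out.
        s_out = spk_enc(y_out.transpose(1, 2), (y_out_mask == 0).squeeze(1))
    finally:
        # Unfreeze: restore whatever requires_grad flags were set before.
        for p, flag in zip(spk_enc.parameters(), prev):
            p.requires_grad_(flag)
    return (1.0 - F.cosine_similarity(s_ref, s_out, dim=-1)).mean()
```

Freezing only the parameters, rather than wrapping the second call in `torch.no_grad()`, is the important design choice here: the loss still produces gradients with respect to `y_out`, so it can train the generator without changing the speaker encoder.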

hcy71o · 2023-11-08T01:12:29Z

This repo is built on plain VITS and StyleSpeech in order to verify the SC-CNN technique. Do you mean the SCL (speaker consistency loss) from YourTTS?

p0p4k · 2023-11-10T09:47:06Z

Is there any StyleSpeech loss in SC-CNN?

hcy71o · 2023-12-29T08:24:01Z

Loss terms related to meta-learning are excluded in this repo.
