Pitch Loss calculation #183

blueyred · 2024-04-02T13:07:30Z

I've been trying out variance model generation, which is amazing!

However, I am struggling to get the model to learn pitch accurately from the data, and I wonder whether the sections containing rest notes may be interfering with training.

Would it be possible to add a parameter that excludes pitch samples from the loss calculation wherever there are rest notes (note_rest)?

The change would go just before this code:

DiffSinger/training/variance_task.py

Line 202 in f958001

losses['pitch_loss'] = self.lambda_pitch_loss * self.pitch_loss(

That is, check for any rest notes and then scrub the sections of pitch_pred that correspond to them?
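
To make the proposal concrete, here is a minimal, framework-free sketch of a pitch loss that skips rest-note frames. It is not DiffSinger code: the function name `masked_pitch_loss`, the per-frame `is_rest_frame` flags, and the choice of mean squared error are all assumptions made for illustration. The real training code operates on tensors and uses its own `self.pitch_loss`.

```python
from typing import Sequence


def masked_pitch_loss(
    pitch_pred: Sequence[float],
    pitch_target: Sequence[float],
    is_rest_frame: Sequence[bool],
) -> float:
    """Mean squared error over frames that do not belong to a rest note.

    All names here are hypothetical; this only illustrates the masking idea.
    """
    if not (len(pitch_pred) == len(pitch_target) == len(is_rest_frame)):
        raise ValueError("all inputs must have the same number of frames")

    total = 0.0
    count = 0
    for pred, target, rest in zip(pitch_pred, pitch_target, is_rest_frame):
        if rest:
            # Frames inside rest notes contribute nothing to the loss.
            continue
        total += (pred - target) ** 2
        count += 1

    # If every frame is a rest, there is nothing to penalize.
    return total / count if count else 0.0
```

With this sketch, a large error on a rest frame has no effect on the result: only the frames flagged as non-rest are averaged.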

Thanks again for the project!

yqzhishen · 2024-04-02T13:59:35Z

Maintainer

Rest notes are common in labels, just like APs and SPs in phoneme transcriptions, so skipping them doesn't seem reasonable. We don't use voiced/unvoiced (UV) masks because doing so would require predicting the masks, which would make the architecture much more complicated than it is now (and UV masks themselves aren't easy to extract). Instead, we interpolate the pitch curve over unvoiced parts and expect the pitch model to learn this as well. In practice, this doesn't seem to affect accuracy. By the way, the pitch_acc metric on TensorBoard excludes all unvoiced frames, so it reflects the real accuracy. If your model cannot learn well, check your labels or try enabling the melody encoder, etc.
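
The two ideas in this reply, interpolating pitch over unvoiced frames and measuring accuracy on voiced frames only, can be sketched roughly as follows. This is an illustrative sketch, not the project's implementation: the function names, the linear interpolation with edge values held constant, and the 0.5 tolerance are assumptions, and the actual definition of pitch_acc may differ.

```python
from typing import List, Sequence


def interpolate_unvoiced(f0: Sequence[float], voiced: Sequence[bool]) -> List[float]:
    """Fill unvoiced frames by linear interpolation between voiced neighbours.

    Leading and trailing unvoiced frames copy the nearest voiced value.
    (Hypothetical helper; the edge handling is an assumption.)
    """
    voiced_idx = [i for i, v in enumerate(voiced) if v]
    out = list(f0)
    if not voiced_idx:
        return out  # nothing to interpolate from

    first, last = voiced_idx[0], voiced_idx[-1]
    for i in range(first):
        out[i] = f0[first]
    for i in range(last + 1, len(f0)):
        out[i] = f0[last]

    # Interpolate inside each gap between consecutive voiced frames.
    for left, right in zip(voiced_idx, voiced_idx[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            out[i] = f0[left] + weight * (f0[right] - f0[left])
    return out


def voiced_pitch_accuracy(
    pitch_pred: Sequence[float],
    pitch_target: Sequence[float],
    voiced: Sequence[bool],
    tolerance: float = 0.5,  # assumed threshold, in the same unit as the pitch values
) -> float:
    """Fraction of voiced frames whose prediction is within `tolerance` of the target."""
    hits = 0
    total = 0
    for pred, target, is_voiced in zip(pitch_pred, pitch_target, voiced):
        if not is_voiced:
            continue  # unvoiced frames are excluded from the metric
        total += 1
        if abs(pred - target) <= tolerance:
            hits += 1
    return hits / total if total else 0.0
```

In this sketch the model still trains on a continuous curve (the interpolated one), while the reported accuracy ignores frames where no real pitch exists.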
