F0 Converter for P - loss function values #36

Open · rishabhjain16 opened this issue Apr 14, 2021 · 7 comments
@rishabhjain16

I am trying to replicate your work. I am currently building the F0 converter model to generate the P checkpoint, and I am stuck at the loss calculation.

I see that when I use the F0_Converter model to generate P, I get a 257-dimensional one-hot encoded feature P.

Demo.ipynb

f0_pred = P(uttr_org_pad, f0_trg_onehot)[0]
f0_pred.shape
> torch.Size([192, 257])

I wanted to ask: when training the F0 converter model, what values are you using to calculate the loss?

I tried the following, but I am not sure it is the right way. This is what I am doing to generate f0_pred and to calculate the loss:

f0_pred = self.P(x_real_org, f0_org_intrp)[0]
p_loss_id = F.mse_loss(f0_pred, f0_org_intrp, reduction='mean')

I just want to know if I am on the right track.
Can you help me out here, @auspicious3000?

@auspicious3000
Owner

The output of the f0 predictor is a 257-dim logit, not a one-hot vector, so you need to use cross-entropy loss, as indicated in the paper.
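For example, a minimal sketch of how that loss would be set up, assuming the predictor outputs per-frame logits of shape [B, T, 257] (the names and shapes here are my assumption, not the exact solver.py code):

    import torch.nn.functional as F

    # f0_logits:  [B, T, 257] raw logits from the F0 converter
    # target_idx: [B, T] integer class index per frame
    p_loss = F.cross_entropy(f0_logits.transpose(1, 2),  # cross_entropy wants [B, 257, T]
                             target_idx,
                             reduction='mean')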

@rishabhjain16
Author

rishabhjain16 commented Apr 14, 2021

Thank you for your quick response. I understand what you are saying; I found that in the appendix of the paper. What I meant to ask about are the two values you are using to calculate the loss. How are you getting f0_org in 257 dimensions to feed into the loss function?

The loss function requires two values. One is f0_pred, which is the output of the F0_Converter model. What is the other value?

In other words, what is the second input to the cross-entropy loss?

@auspicious3000
Owner

The target is the quantized ground-truth f0, based on https://arxiv.org/abs/2004.07370
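Roughly, a standalone sketch of that quantization into 256 pitch bins plus one unvoiced bin (the repo's quantize_f0_torch handles the details; the normalization below is my assumption):

    import torch
    import torch.nn.functional as F

    def quantize_f0_sketch(f0, num_bins=256):
        # f0: [T] contour normalized to [0, 1], with values <= 0 meaning unvoiced
        idx = torch.clamp((f0 * num_bins).long() + 1, 1, num_bins)  # bins 1..256
        idx[f0 <= 0] = 0                                            # bin 0 = unvoiced
        onehot = F.one_hot(idx, num_classes=num_bins + 1).float()   # [T, 257]
        return onehot, idx

The bin index (or the argmax of the one-hot) is then the target for the cross-entropy loss.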

@rishabhjain16
Author

Thanks for your help. The paper covered most of my doubts. Great read.

@Merlin-721

In the 'Train the generator' section of solver.py:

        self.G = self.G.train()
        self.P = self.P.train()
                    
        # G Identity mapping loss
        x_f0 = torch.cat((x_real_org, f0_org), dim=-1)
        x_f0_intrp = self.Interp(x_f0, len_org) 

        f0_org_intrp = quantize_f0_torch(x_f0_intrp[:,:,-1])[0]
        x_f0_intrp_org = torch.cat((x_f0_intrp[:,:,:-1], f0_org_intrp), dim=-1)

        # G forward
        x_pred = self.G(x_f0_intrp_org, x_real_org, emb_org)
        g_loss_id = F.mse_loss(x_real_org, x_pred, reduction='mean') 

        
        # Preprocess f0_trg for P 
        x_f0_trg = torch.cat((x_real_trg, f0_trg), dim=-1)
        x_f0_intrp_trg = self.Interp(x_f0_trg, len_trg) 

        # Target for P
        f0_trg_intrp = quantize_f0_torch(x_f0_intrp_trg[:,:,-1])[0]

        # P forward
        f0_pred = self.P(x_real_org, f0_trg_intrp)             # [B, T, 257] logits
        f0_trg_intrp_indx = f0_trg_intrp.argmax(2)             # [B, T] target bin indices
        p_loss_id = F.cross_entropy(f0_pred.transpose(1,2), f0_trg_intrp_indx, reduction='mean')

        # Backward and optimize.
        g_loss = g_loss_id
        p_loss = p_loss_id
        self.reset_grad()
        g_loss.backward()
        p_loss.backward()
        self.g_optimizer.step()
        self.p_optimizer.step()

        # Logging.
        loss = {}
        loss['G/loss_id'] = g_loss_id.item()
        loss['P/loss_id'] = p_loss_id.item()

This appears to be working for me (i.e. it seems to run, at least!)

@3139725181

(quotes the 'Train the generator' code from @Merlin-721's comment above)

Hello, I want to know where x_real_trg comes from.

@Merlin-721

I've changed some of the code around since, but hopefully this helps a bit. Both 'org' and 'trg' are just different utterance instances; I had tried applying some code from elsewhere in the repo to training, so I kept those naming conventions.
You can see here that I've used the same instance to train both models:

            x_real_org, emb_org, f0_org, len_org = next(data_iter)
            # applies .to(self.device) to each:
            x_real_org, emb_org, len_org, f0_org = self.data_to_device([x_real_org, emb_org, len_org, f0_org])

            # combines spect and f0s
            x_f0 = torch.cat((x_real_org, f0_org), dim=-1)
            # Random resampling with linear interpolation
            x_f0_intrp = self.Interp(x_f0, len_org) 
            # strips f0 from trimmed to quantize it
            f0_org_intrp = quantize_f0_torch(x_f0_intrp[:,:,-1])[0]

            self.G = self.G.train()
            # combines quantized f0 back with spect
            x_f0_intrp_org = torch.cat((x_f0_intrp[:,:,:-1], f0_org_intrp), dim=-1)

            # G forward
            x_pred = self.G(x_f0_intrp_org, x_real_org, emb_org)
            g_loss_id = F.mse_loss(x_pred, x_real_org, reduction='mean') 

            # Backward and optimize.
            self.g_optimizer.zero_grad()
            g_loss_id.backward()
            self.g_optimizer.step()

            loss['G/loss_id'] = g_loss_id.item()

          # =================================================================================== #
          #                               3. F0_Converter Training                              #
          # =================================================================================== #


            self.P = self.P.train()
            # [B, T] target bin indices from the quantized (one-hot) f0
            f0_trg_intrp_indx = f0_org_intrp.argmax(2)

            # P forward
            f0_pred = self.P(x_real_org, f0_org_intrp)
            # cross_entropy expects [B, C, T] logits and [B, T] class indices
            p_loss_id = F.cross_entropy(f0_pred.transpose(1,2), f0_trg_intrp_indx, reduction='mean')


            self.p_optimizer.zero_grad()
            p_loss_id.backward()
            self.p_optimizer.step()
            loss['P/loss_id'] = p_loss_id.item()
