F0 Converter for P - loss function values #36

Open · rishabhjain16 opened this issue Apr 14, 2021 · 7 comments
@rishabhjain16

I am trying to replicate your work. I am currently building the F0 converter model to generate the P checkpoint, and I am stuck at the loss calculation.

I see that when I use the F0_Converter model to generate P, I get a 257-dimensional one-hot encoded feature P.

Demo.ipynb

f0_pred = P(uttr_org_pad, f0_trg_onehot)[0]
f0_pred.shape
> torch.Size([192, 257])

I wanted to ask: when training the F0 converter model, what values are you using to calculate the loss?

I tried the following, but I am not sure it is the right way. This is what I am doing to generate f0_pred and to calculate the loss:

f0_pred = self.P(x_real_org, f0_org_intrp)[0]
p_loss_id = F.mse_loss(f0_pred, f0_org_intrp, reduction='mean')

I just want to know if I am on the right track.
Can you help me out here, @auspicious3000?

@auspicious3000
Owner

The output of the f0 predictor is a 257-dim logit, not a one-hot vector, so you need to use cross-entropy loss, as indicated in the paper.
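For example, a minimal sketch of how that loss would be set up, assuming the predictor outputs per-frame logits of shape [B, T, 257] (the names and shapes here are my assumption, not the exact solver.py code):

    import torch.nn.functional as F

    # f0_logits:  [B, T, 257] raw logits from the F0 converter
    # target_idx: [B, T] integer class index per frame
    p_loss = F.cross_entropy(f0_logits.transpose(1, 2),  # cross_entropy wants [B, 257, T]
                             target_idx,
                             reduction='mean')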

@rishabhjain16
Author

rishabhjain16 commented Apr 14, 2021

Thank you for your quick response. I understand what you are saying; I found that in the appendix of the paper. What I meant to ask about are the two values you are using to calculate the loss. How are you getting f0_org in 257 dimensions to feed into the loss function?

The loss function requires two values. One is f0_pred, which is the output of the F0_Converter model. What is the other value?

In other words, what is the second input to the cross-entropy loss?

@auspicious3000
Owner

The target is the quantized ground-truth f0, based on https://arxiv.org/abs/2004.07370
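Roughly, a standalone sketch of that quantization into 256 pitch bins plus one unvoiced bin (the repo's quantize_f0_torch handles the details; the normalization below is my assumption):

    import torch
    import torch.nn.functional as F

    def quantize_f0_sketch(f0, num_bins=256):
        # f0: [T] contour normalized to [0, 1], with values <= 0 meaning unvoiced
        idx = torch.clamp((f0 * num_bins).long() + 1, 1, num_bins)  # bins 1..256
        idx[f0 <= 0] = 0                                            # bin 0 = unvoiced
        onehot = F.one_hot(idx, num_classes=num_bins + 1).float()   # [T, 257]
        return onehot, idx

The bin index (or the argmax of the one-hot) is then the target for the cross-entropy loss.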

@rishabhjain16
Author

Thanks for your help. The paper covered most of my doubts. Great read.

@Merlin-721

In the 'Train the generator' section of solver.py:

        self.G = self.G.train()
        self.P = self.P.train()
                    
        # G Identity mapping loss
        x_f0 = torch.cat((x_real_org, f0_org), dim=-1)
        x_f0_intrp = self.Interp(x_f0, len_org) 

        f0_org_intrp = quantize_f0_torch(x_f0_intrp[:,:,-1])[0]
        x_f0_intrp_org = torch.cat((x_f0_intrp[:,:,:-1], f0_org_intrp), dim=-1)

        # G forward
        x_pred = self.G(x_f0_intrp_org, x_real_org, emb_org)
        g_loss_id = F.mse_loss(x_real_org, x_pred, reduction='mean') 

        
        # Preprocess f0_trg for P 
        x_f0_trg = torch.cat((x_real_trg, f0_trg), dim=-1)
        x_f0_intrp_trg = self.Interp(x_f0_trg, len_trg) 

        # Target for P
        f0_trg_intrp = quantize_f0_torch(x_f0_intrp_trg[:,:,-1])[0]

        # P forward
        f0_pred = self.P(x_real_org, f0_trg_intrp)             # [B, T, 257] logits
        f0_trg_intrp_indx = f0_trg_intrp.argmax(2)             # [B, T] target bin indices
        p_loss_id = F.cross_entropy(f0_pred.transpose(1,2), f0_trg_intrp_indx, reduction='mean')

        # Backward and optimize.
        g_loss = g_loss_id
        p_loss = p_loss_id
        self.reset_grad()
        g_loss.backward()
        p_loss.backward()
        self.g_optimizer.step()
        self.p_optimizer.step()

        # Logging.
        loss = {}
        loss['G/loss_id'] = g_loss_id.item()
        loss['P/loss_id'] = p_loss_id.item()

This appears to be working for me (i.e. it seems to run, at least!)

@3139725181

(quotes the 'Train the generator' code from @Merlin-721's comment above)

Hello, I want to know where x_real_trg comes from.

@Merlin-721

I've changed some of the code around since, but hopefully this helps a bit. Both 'org' and 'trg' are just different utterance instances; I had tried applying some code from elsewhere in the repo to training, so I kept those naming conventions.
You can see here that I've used the same instance to train both models:

            x_real_org, emb_org, f0_org, len_org = next(data_iter)
            # applies .to(self.device) to each:
            x_real_org, emb_org, len_org, f0_org = self.data_to_device([x_real_org, emb_org, len_org, f0_org])

            # combines spect and f0s
            x_f0 = torch.cat((x_real_org, f0_org), dim=-1)
            # Random resampling with linear interpolation
            x_f0_intrp = self.Interp(x_f0, len_org) 
            # strips f0 from trimmed to quantize it
            f0_org_intrp = quantize_f0_torch(x_f0_intrp[:,:,-1])[0]

            self.G = self.G.train()
            # combines quantized f0 back with spect
            x_f0_intrp_org = torch.cat((x_f0_intrp[:,:,:-1], f0_org_intrp), dim=-1)

            # G forward
            x_pred = self.G(x_f0_intrp_org, x_real_org, emb_org)
            g_loss_id = F.mse_loss(x_pred, x_real_org, reduction='mean') 

            # Backward and optimize.
            self.g_optimizer.zero_grad()
            g_loss_id.backward()
            self.g_optimizer.step()

            loss['G/loss_id'] = g_loss_id.item()

          # =================================================================================== #
          #                               3. F0_Converter Training                              #
          # =================================================================================== #


            self.P = self.P.train()
            # [B, T] target bin indices from the quantized (one-hot) f0
            f0_trg_intrp_indx = f0_org_intrp.argmax(2)

            # P forward
            f0_pred = self.P(x_real_org, f0_org_intrp)
            # cross_entropy expects [B, C, T] logits and [B, T] class indices
            p_loss_id = F.cross_entropy(f0_pred.transpose(1,2), f0_trg_intrp_indx, reduction='mean')


            self.p_optimizer.zero_grad()
            p_loss_id.backward()
            self.p_optimizer.step()
            loss['P/loss_id'] = p_loss_id.item()
