PIT Loss for multichannel audio for speech separation #691

SutirthaChakraborty · 2024-02-25T00:49:31Z

I have 4-channel audio generated by my model (left, right, side, mid).
How can I apply a PIT loss to it?
The shapes of the tensors are:
Speaker one: [batch, channel, time]
Speaker two: [batch, channel, time]

If I need to apply PIT, how should I arrange the tensors: as [batch, channel, speaker, time]?

If I convert the output to mono, or take the mean over the channels, the model is unable to learn the 4 channels properly.

mpariente · 2024-03-02T09:02:08Z

I think the channel should be first, in order to build the permutation matrix of dimension (batch, speaker, speaker) with broadcasting.
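One possible reading of this suggestion, sketched below under stated assumptions: give the speakers their own axis, use broadcasting to compare every estimate with every target, and reduce over channel and time so that a (batch, speaker, speaker) pairwise loss matrix remains. All channels of a speaker are then permuted together. The [batch, speaker, channel, time] layout, the MSE criterion, and the helper names `pairwise_mse` and `pit_loss` are assumptions made for illustration, not something stated in this thread or taken from a library. NumPy is used only to keep the sketch self-contained.

```python
from itertools import permutations

import numpy as np


def pairwise_mse(est, tgt):
    """Pairwise MSE between every estimated and every target speaker.

    est, tgt: arrays of shape [batch, speaker, channel, time] (assumed layout).
    Returns shape [batch, speaker, speaker]; entry [b, i, j] is the MSE
    between estimate i and target j, averaged over channels and time.
    """
    # Broadcast to [batch, est_speaker, tgt_speaker, channel, time].
    diff = est[:, :, None] - tgt[:, None, :]
    return (diff ** 2).mean(axis=(-2, -1))


def pit_loss(est, tgt):
    """Permutation-invariant MSE where a speaker's channels move together."""
    pw = pairwise_mse(est, tgt)                  # [batch, spk, spk]
    n_src = pw.shape[1]
    perms = list(permutations(range(n_src)))
    # Loss of one permutation p: mean over estimates i of pw[b, i, p[i]].
    perm_losses = np.stack(
        [pw[:, np.arange(n_src), list(p)].mean(axis=1) for p in perms],
        axis=1,
    )                                            # [batch, n_perms]
    best = perm_losses.argmin(axis=1)            # best permutation per item
    min_loss = perm_losses[np.arange(len(best)), best]
    best_perms = np.array(perms)[best]           # [batch, spk]
    return min_loss.mean(), best_perms


# Usage with random data: 3 items, 2 speakers, 4 channels, 160 samples.
rng = np.random.default_rng(0)
tgt = rng.standard_normal((3, 2, 4, 160))
est = tgt[:, ::-1]                               # same signals, speakers swapped
loss, best_perms = pit_loss(est, tgt)
```

Here `best_perms[b, i]` is the index of the target matched to estimate `i` in batch item `b`. Because the reduction runs over channel and time only after the per-speaker difference is taken, the four channels are never mixed down to mono, which is the situation described in the question. Exhaustive search over permutations is fine for two or three speakers; it grows factorially with the number of sources.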

SutirthaChakraborty added the question label Feb 25, 2024
