audio distance and PQMF in 2.3.1 #302

Open
victor-shepardson opened this issue Mar 13, 2024 · 1 comment

Comments

@victor-shepardson

fullband spectral distance is now computed between the original waveform and the model reconstruction:

x_raw = batch

fullband_distance = self.audio_distance(x_raw, y_raw)

compared to older versions, where it was computed between the model reconstruction and the PQMF reconstruction of the data:

RAVE/rave/model.py

Lines 239 to 249 in 7661926

    x = self.pqmf.inverse(x_multiband)
    y = self.pqmf.inverse(y_multiband)
    p.tick('recompose')
    for k, v in multiband_distance.items():
        distances[f'multiband_{k}'] = v
else:
    x = x_multiband
    y = y_multiband
fullband_distance = self.audio_distance(x, y)

iiuc this should mainly affect causal models, but it seems like a problem in that case, since a causal model would not be able to compensate for the delay induced by the PQMF, unless I misunderstand?

do you know if the change was deliberate or why it was made?
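
The delay concern above can be illustrated with a toy sketch. It assumes, purely for illustration, that the PQMF analysis/synthesis round trip behaves like a pure delay of `d` samples and that the distance is a plain waveform L1; the real `audio_distance` is spectral and the real PQMF delay depends on its filter length, so `delay`, `l1_distance`, `n` and `d` are all hypothetical stand-ins rather than RAVE code.

```python
import math

def delay(signal, d):
    """Causal delay by d samples: output[t] = signal[t - d], zeros before that."""
    return [0.0] * d + signal[:len(signal) - d]

def l1_distance(a, b):
    """Mean absolute difference between two equal-length waveforms."""
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)

# Hypothetical test signal and delay length (not taken from RAVE).
n, d = 256, 8
x_raw = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]

# Stand-in for pqmf.inverse(pqmf(x)), assumed here to be a pure delay.
x_pqmf = delay(x_raw, d)

# Best case for a causal model whose output goes through the same delayed path.
y = delay(x_raw, d)

# Older behaviour described in the issue: the target shares the delay.
dist_vs_pqmf = l1_distance(x_pqmf, y)
# Behaviour described for 2.3.1: the target is the undelayed waveform.
dist_vs_raw = l1_distance(x_raw, y)

print(dist_vs_pqmf, dist_vs_raw)
```

Under these assumptions the first distance is zero and the second is not, even though the output is a perfect delayed copy. A magnitude-spectrogram distance would likely be less sensitive to a short shift than this waveform L1, so the sketch probably overstates the effect.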

@domkirke
Collaborator

domkirke commented Apr 22, 2024

Hello Victor,

I am finally working through your issues. I actually changed the code to be more general, as the older versions were a little tweaked, but the inner behavior did not change: a multiband loss for each PQMF band, and a fullband loss for the global signal. Regarding causality: it only changes the internal padding of the convolutional layers, so that a given latent position does not influence the future of the sequence. Furthermore, reconstruction is done on both slightly delayed PQMF signals, and the model is then trained to compensate for the delay with the fullband distance. Some improvements could be made here though, even if nothing comes to mind right now.
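
The point about causality and padding can be sketched with a generic 1D convolution. This is not RAVE's layer code: `conv1d`, the kernel and the impulse position are hypothetical, and the only thing compared is symmetric padding against left-only padding.

```python
def conv1d(signal, kernel, pad_left, pad_right):
    """Plain 1D correlation with explicit zero padding on each side."""
    padded = [0.0] * pad_left + list(signal) + [0.0] * pad_right
    k = len(kernel)
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(padded) - k + 1)]

kernel = [1.0, 1.0, 1.0]

# Unit impulse at time step 4 in an otherwise silent signal.
impulse_at = 4
x = [0.0] * 9
x[impulse_at] = 1.0

centered = conv1d(x, kernel, 1, 1)  # symmetric padding
causal = conv1d(x, kernel, 2, 0)    # all padding on the left

# First output step that is affected by the impulse in each case.
first_centered = next(i for i, v in enumerate(centered) if v != 0.0)
first_causal = next(i for i, v in enumerate(causal) if v != 0.0)

print(first_centered, first_causal)
```

With symmetric padding the impulse already affects the output one step before it occurs, while with left-only padding the first affected output is the impulse's own time step. Both variants keep the output length equal to the input length, so the padding layout is the only difference.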
