valid_signal_crop in validation_step? #185
Comments
The cropping is only useful for training, where it drops the part of the signal that has zero gradients. Cropping in validation_step would not make much sense, and would change the output dimensionality. Furthermore, the audio is not related to the curves; causal configurations unfortunately limit RAVE's modelling capacity, so the sound quality may be due to the training and configuration. I don't know if @caillonantoine has additional comments?
Agree that this change doesn't affect training, only logging. However, I'm quite certain it works as described; I've been using it on my fork. Since the beginning of the reconstruction gets cropped from the loss during training, I believe the model is incentivized to collapse the corresponding latents (i.e., those influenced by zero padding) to the prior. So the beginning of the reconstruction ends up unrelated to the input. This leads to high reconstruction error when that part isn't cropped at validation time, which makes the validation curve in TensorBoard noisy and unreadable. Also, I'm quite certain the logged audio is affected: it's the same audio computed at Line 457 in b67a187.
This change only shortens the logged audio by slicing off the 'random' prior-collapsed part, but I find it easier to hear how faithful the reconstructions are this way.
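To make the argument concrete, here is a minimal sketch of how an untrained, padding-affected head can dominate a validation error, and how cropping both signals before comparing removes it. Everything here is an assumption for illustration: `crop_valid` and `mse` are hypothetical helpers (not RAVE's actual API), the signals are toy data, and `left_rf` stands in for however many samples are influenced by zero padding.

```python
from typing import List, Sequence


def crop_valid(signal: Sequence[float], left_rf: int, right_rf: int = 0) -> List[float]:
    """Drop samples influenced by zero padding.

    Hypothetical helper for illustration, not RAVE's implementation.
    left_rf / right_rf are the number of samples to drop on each side.
    """
    end = len(signal) - right_rf if right_rf else len(signal)
    return list(signal[left_rf:end])


def mse(x: Sequence[float], y: Sequence[float]) -> float:
    """Mean squared error between two equal-length signals."""
    assert len(x) == len(y) and len(x) > 0
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


# Toy data: the first 4 reconstructed samples are unrelated to the input,
# the remaining 4 match it exactly.
x = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]  # input
y = [1.0, -1.0, 1.0, 1.0, 0.0, 1.0, 0.0, -1.0]  # reconstruction
left_rf = 4  # assumed number of padding-affected samples

full_error = mse(x, y)  # includes the unreliable head
cropped_error = mse(crop_valid(x, left_rf), crop_valid(y, left_rf))

print(full_error, cropped_error)
```

Note that the cropped signals are shorter than the originals (4 samples instead of 8 here), which is the change in output dimensionality raised above. Anything downstream that expects the full length would need to account for that.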
I noticed when training causal models with RAVE v2 that the validation audio sounds pretty bad. If I'm understanding correctly, it's because v2 crops to the valid (as in convolution) portion of the signal, so the part of the reconstruction which is affected by zero padding (~2/3 of it with v2 defaults) is not trained at all. But validation_step doesn't do the same cropping, so the validation curve looks very noisy and the audio sounds bad in TensorBoard. Would it make sense to include the same cropping in validation_step?
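For anyone wanting to estimate how much of a window is affected by zero padding, here is a rough sketch using the standard receptive-field formula for a stack of causal 1D convolutions. The layer configuration and window length below are made up for illustration and are not RAVE's defaults, so this does not reproduce the ~2/3 figure; it also ignores the decoder and the multiband decomposition.

```python
from typing import List, Tuple

# Each layer is (kernel_size, stride, dilation), first layer first.
Layer = Tuple[int, int, int]


def causal_receptive_field(layers: List[Layer]) -> int:
    """Receptive field, in input samples, of a stack of 1D convolutions."""
    rf = 1
    jump = 1  # spacing, in input samples, between adjacent positions at this layer's input
    for kernel_size, stride, dilation in layers:
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


def padding_affected_fraction(layers: List[Layer], window: int) -> float:
    """Rough fraction of a window whose outputs see left zero padding.

    With causal (left-only) padding, outputs aligned with roughly the first
    rf - 1 input samples depend on padded zeros.
    """
    affected = causal_receptive_field(layers) - 1
    return min(1.0, affected / window)


# Hypothetical configuration, NOT RAVE's actual architecture.
layers = [(3, 1, 1), (3, 2, 1), (3, 2, 1)]
window = 32  # hypothetical window length in samples

rf = causal_receptive_field(layers)
fraction = padding_affected_fraction(layers, window)
print(rf, fraction)
```

The longer the receptive field relative to the training window, the larger the share of each reconstruction that is excluded from the loss by the crop.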