I'm a researcher building a TTS model using diffusion. While looking for an implementation, I found this repo.
As I understand the paper, both the forward and backward (reverse) diffusion processes in the decoder are supposed to operate on the latent vector z, which is produced by the encoder part of the U-Net. However, the implementation in this repo appears to differ from that.
Could you explain the reasoning behind this?