Contradiction in the Appendix regarding the Latent DDIM #62

Open
anthony-mendil opened this issue Aug 31, 2023 · 2 comments

Comments

@anthony-mendil

In Appendix A.1 of your paper it is stated that the skip connections concatenate the input
with the output from the previous layer. However, in the visualization in Figure 8, as well as in the
provided code, the connection instead runs from the original input to every layer
(not from each previous layer).
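
For concreteness, here is a minimal sketch of that second variant, the one shown in Figure 8 and in the released code, where the original input is concatenated onto the input of every hidden layer. This assumes PyTorch; the class name `ConcatInputMLP` and the layer sizes are made up for illustration, it is not the exact implementation in this repo, and timestep conditioning is omitted for brevity:

```python
import torch
import torch.nn as nn

class ConcatInputMLP(nn.Module):
    """Every hidden layer receives [previous hidden state, original input]."""

    def __init__(self, in_dim: int, hidden_dim: int, num_layers: int):
        super().__init__()
        layers = []
        for i in range(num_layers):
            # The first layer sees only the input; later layers see the
            # previous hidden state concatenated with the original input.
            d_in = in_dim if i == 0 else hidden_dim + in_dim
            layers.append(nn.Linear(d_in, hidden_dim))
        self.layers = nn.ModuleList(layers)
        self.act = nn.SiLU()
        self.out = nn.Linear(hidden_dim, in_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = x
        for i, layer in enumerate(self.layers):
            if i > 0:
                h = torch.cat([h, x], dim=-1)  # skip connection from the original input
            h = self.act(layer(h))
        return self.out(h)
```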

Intuitively, I would think that the type of residual connection described in the text
would make more sense for the Latent DDIM: it would function similarly to Transformers,
but without the Multi-Head Attention. The skips in the code and the figure seem better
suited to cases in which information is compressed by the layers, which is not the case
in the Latent DDIM.
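
For contrast, a rough sketch of the alternative I have in mind (again assuming PyTorch, with hypothetical names and sizes): Transformer-style pre-norm residual blocks, i.e. a skip from the previous layer's output, but without Multi-Head Attention:

```python
import torch
import torch.nn as nn

class ResidualMLP(nn.Module):
    """Each block adds its output to the previous layer's output (pre-norm residual)."""

    def __init__(self, in_dim: int, hidden_dim: int, num_layers: int):
        super().__init__()
        self.proj_in = nn.Linear(in_dim, hidden_dim)
        self.blocks = nn.ModuleList([
            nn.Sequential(
                nn.LayerNorm(hidden_dim),
                nn.Linear(hidden_dim, hidden_dim),
                nn.SiLU(),
                nn.Linear(hidden_dim, hidden_dim),
            )
            for _ in range(num_layers)
        ])
        self.out = nn.Linear(hidden_dim, in_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.proj_in(x)
        for block in self.blocks:
            h = h + block(h)  # residual skip from the previous layer's output
        return self.out(h)
```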

I would be very thankful for your insight on this topic.
Kind regards, Anthony Mendil.

@phizaz
Owner

phizaz commented Aug 31, 2023

I'm sorry if the sentence "Each layer of the MLP has a skip connection from the input" is ambiguous, but it means exactly what you understood: the original input is concatenated to each of the layers!

I did try residual connections, and I found that concatenation works better. Another point: the architecture details seem to matter less than the number of parameters in the latent DDIM. This is mostly a hunch, though; I have no solid evidence for it.


@anthony-mendil
Author

Thanks for the clarification and your insight!
The contradiction was actually in the next part of the sentence:
"which simply concatenates the input with the output from the previous layer".
