What is the importance of DDIM? #96

vedantroy · 2023-10-18T20:00:07Z

I am working on integrating this repository with ComfyUI, so this project can get support for ControlNet, LoRA, etc. all for free from ComfyUI's infrastructure.

The main issue is that I need to understand exactly how the DDIM sampler was hacked.
From reading the paper, it looks like it was patched to support cross-frame attention.
This makes me guess that DDIM itself is not important here, i.e., the standard diffusion schedule could be used as well if it were patched in the same way. Is this assumption incorrect?

williamyang1991 · 2023-10-19T02:19:36Z

Thank you for your interest.

It is almost not important.
We rely on DDIM in two places.

The first is that it can predict a denoised version $\hat{x}_{t\rightarrow0}$, which we can warp.
The second is the noise-adding process. We rescale the noise level so that the fusion of two noisy latents has the same noise level as before fusing.
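For readers less familiar with the sampler, the two pieces mentioned above can be seen in a textbook DDIM update. This is a minimal scalar sketch of the standard epsilon-prediction form, not the code in `ddim_v_hacked.py` (whose name suggests a v-prediction variant). The function name `ddim_step` and its arguments are made up for illustration; `alpha_t` and `alpha_prev` stand for the cumulative alpha products.

```python
import math


def ddim_step(x_t, eps, alpha_t, alpha_prev, sigma_t=0.0, noise=0.0):
    """One textbook DDIM update on a scalar latent (illustrative only).

    x_t        : current noisy latent
    eps        : noise predicted by the model at step t
    alpha_t    : cumulative alpha product at step t
    alpha_prev : cumulative alpha product at step t-1
    sigma_t    : stochasticity of the step (0 gives deterministic DDIM)
    noise      : fresh Gaussian noise, only used when sigma_t > 0
    """
    # Predicted clean sample: the quantity that can be decoded and warped.
    pred_x0 = (x_t - math.sqrt(1.0 - alpha_t) * eps) / math.sqrt(alpha_t)
    # Noise-carrying term added back to reach the noise level of step t-1.
    dir_xt = math.sqrt(1.0 - alpha_prev - sigma_t ** 2) * eps
    x_prev = math.sqrt(alpha_prev) * pred_x0 + dir_xt + sigma_t * noise
    return x_prev, pred_x0, dir_xt


# Build x_t from a known clean value and known noise, then invert it.
x0, eps = 0.5, -1.2
x_t = math.sqrt(0.6) * x0 + math.sqrt(0.4) * eps
x_prev, pred_x0, dir_xt = ddim_step(x_t, eps, alpha_t=0.6, alpha_prev=0.8)
```

With a perfect noise prediction, `pred_x0` recovers `x0`, and `dir_xt` is the only part of `x_prev` that carries noise when `sigma_t` is 0.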

Rerender_A_Video/src/ddim_v_hacked.py, lines 322 to 323 in fcb7431:

    img = img_ref * weight + (1. - weight) * (
        img - dir_xt) + rescale * dir_xt

We have two latents, img_ref (the encoded warped image $\tilde{x}_{t-1}$) and img ($x_{t-1}$). Both have the same noise level, with standard deviation $\sqrt{1-\alpha_{t-1}}$.
We would like to fuse them with a soft weight $M$ ranging from 0 to 1. (The paper uses a hard binary mask; this repository uses a soft one to prevent error accumulation.)
The plain blend img_ref * weight + (1. - weight) * img ($(1-M)\tilde{x}_{t-1}+Mx_{t-1}$) then has a noise standard deviation of $\sqrt{(M^2+(1-M)^2)(1-\alpha_{t-1})}$, assuming the noise in the two latents is independent. This is smaller than $\sqrt{1-\alpha_{t-1}}$ whenever $M$ is not 0 or 1, and in that case the final image results will be blurry.
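The drop in noise level from a plain blend can be checked numerically. The following is a small standard-library sketch, assuming the noise in the two latents is independent Gaussian noise with the same standard deviation `sigma`; the helper names `blended_noise_std` and `predicted_std` are made up for this example.

```python
import math
import random


def blended_noise_std(weight, sigma, n=200_000, seed=0):
    """Empirical std of weight * n_ref + (1 - weight) * n_img.

    n_ref and n_img are independent Gaussian samples with std sigma,
    standing in for the noise inside img_ref and img.
    """
    rng = random.Random(seed)
    samples = [
        weight * rng.gauss(0.0, sigma) + (1.0 - weight) * rng.gauss(0.0, sigma)
        for _ in range(n)
    ]
    mean = sum(samples) / n
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / n)


def predicted_std(weight, sigma):
    """Closed form: sigma * sqrt(weight^2 + (1 - weight)^2)."""
    return sigma * math.sqrt(weight ** 2 + (1.0 - weight) ** 2)


empirical = blended_noise_std(0.5, 1.0)
closed_form = predicted_std(0.5, 1.0)  # about 0.707, well below sigma = 1.0
```

The closed form is symmetric in the weight and its complement, so the loss of noise is largest at a weight of 0.5 and vanishes at 0 and 1.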

So I rescale dir_xt (the direction pointing to $x_{t-1}$) to make the final fused result still have a noise level of $\sqrt{1-\alpha_{t-1}}$.
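One way to see how a rescale factor can restore the noise level is the sketch below. It keeps the shape of the fusion line from the repository, but the formula for `rescale` is an assumption made for this example, not necessarily what `ddim_v_hacked.py` computes. It also assumes that `dir_xt` carries all of the noise in `img` (deterministic DDIM) and that this noise is independent of the noise in `img_ref`. Under those assumptions the fused noise variance is `(weight**2 + rescale**2)` times the original, so `rescale = sqrt(1 - weight**2)` keeps it unchanged.

```python
import math
import random


def fuse(img_ref, img, dir_xt, weight):
    """Blend two scalar latents and re-add img's noise term, rescaled.

    The rescale below is a hypothetical choice for illustration: it
    satisfies weight**2 + rescale**2 == 1, which preserves the noise std
    when the two noise sources are independent.
    """
    rescale = math.sqrt(1.0 - weight ** 2)
    return img_ref * weight + (1.0 - weight) * (img - dir_xt) + rescale * dir_xt


def fused_noise_std(weight, sigma, n=100_000, seed=0):
    """Empirical std of the fused latent around a shared clean signal."""
    rng = random.Random(seed)
    signal = 0.3  # arbitrary clean component shared by both latents
    out = []
    for _ in range(n):
        dir_xt = rng.gauss(0.0, sigma)             # noise part of img
        img = signal + dir_xt
        img_ref = signal + rng.gauss(0.0, sigma)   # independent noise
        out.append(fuse(img_ref, img, dir_xt, weight))
    mean = sum(out) / n
    return math.sqrt(sum((v - mean) ** 2 for v in out) / n)


std_after_fusion = fused_noise_std(0.5, 0.4)  # stays close to sigma = 0.4
```

At a weight of 1 the result is exactly `img_ref`, and at a weight of 0 it is exactly `img`, so the hard-mask behavior is recovered at both ends.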

These two points are where we use DDIM.
If you are using another schedule, you need to find its predicted $\hat{x}_{t \rightarrow 0}$ to warp, and don't forget to rescale the noise level in some way when fusing $\tilde{x}_{t-1}$ and $x_{t-1}$ (in different schedules, the definition of dir_xt may differ).

For the other parts, I think DDIM is not important.

vedantroy changed the title ~~What is the importance of DDIM / where is the original DDIM source?~~ What is the importance of DDIM? Oct 18, 2023
