
Feature request: Use depth pass as input mask for new SD 2.0 depth model for more structural coherence #49

Open
marianbasti opened this issue Nov 29, 2022 · 14 comments
Assignees: benrugg
Labels: enhancement (New feature or request)

@marianbasti (Contributor) commented Nov 29, 2022

Describe the feature you'd like to see:

This is exciting! Alongside the latest release, we now have a model with an extra input for a depth map: https://huggingface.co/stabilityai/stable-diffusion-2-depth.

A1111's repo hasn't included this in the API yet, but it's definitely something to keep an eye on, as it brings us a step closer to temporal coherence.
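For anyone who wants to try the model directly in the meantime, here's a rough sketch using Hugging Face's diffusers library (untested here; the pipeline class and parameters come from the diffusers docs, and the file names and prompt are placeholders):

```python
# Minimal depth2img sketch with diffusers (not AI Render's implementation).
import torch
from diffusers import StableDiffusionDepth2ImgPipeline
from PIL import Image

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

init_image = Image.open("render.png")  # placeholder: a Blender render

result = pipe(
    prompt="a cozy cabin in a snowy forest",  # placeholder prompt
    image=init_image,
    strength=0.7,  # how strongly the init image gets repainted
    # depth_map=...,  # optionally supply your own depth map (e.g. from Blender)
    #                 # instead of the pipeline's internal MiDaS estimate
).images[0]
result.save("depth2img_out.png")
```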

Additional information

No response

@benrugg (Owner) commented Nov 29, 2022

Yes! I can't wait for this. I think it will be really powerful, especially for keeping animation more stable. As soon as it's implemented in DreamStudio, Stable Horde or Automatic1111, I'll try to add it quickly!

benrugg self-assigned this Nov 29, 2022
benrugg added the enhancement (New feature or request) label Nov 29, 2022
@jacbouzada

It would be great if you could implement it!
In my case, I'd be very interested in using it to control the massing of architectural images, as is done in this video: https://www.youtube.com/watch?v=CHfCT2lqNdo

@benrugg (Owner) commented Jan 25, 2023

Yes! This is going to be so useful. The Stability folks have a release planned for next week that will finally support depth2img. I'm planning to add support for it next week (to the integrations with both DreamStudio and Automatic1111).

@benrugg (Owner) commented Apr 14, 2023

This is now done with ControlNet in Automatic1111. The latest AI Render release supports it! Hopefully it will be available in DreamStudio soon.

https://github.com/benrugg/AI-Render/releases/tag/v0.7.5

(or update through the AI Render add-on preferences)

@ghost commented Apr 16, 2023

Yeah! You're awesome!

@JensSchmidt72 commented Apr 17, 2023

Can you pass the actual rendered depth values (z-buffer) into ControlNet, as opposed to a preprocessed guess?
The normals would be awesome too ;)

@benrugg (Owner) commented Apr 17, 2023

@JensSchmidt72 This is next on my list. I had assumed it would be very important, since Blender has accurate depth info compared to ControlNet's estimated depth pass. But after experimenting with it for a while in the web UI, I realized that the real depth image actually underperformed the estimated one most of the time, so I shelved this feature for later.

(As an example: with an object sitting on a table, the real depth pass blends the bottom of the object into the table, whereas the estimated pass separates the object from the table.)
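For reference, enabling the relevant render passes through Blender's Python API looks roughly like this (a sketch; not necessarily how it would be wired into AI Render):

```python
import bpy

# Enable the real depth (Z), mist, and normal passes on the active view layer
# so they can be exported and compared against ControlNet's estimated depth.
view_layer = bpy.context.view_layer
view_layer.use_pass_z = True        # raw camera-space depth (the z-buffer)
view_layer.use_pass_mist = True     # mist pass (comes up below)
view_layer.use_pass_normal = True   # surface normals, also useful for ControlNet
```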

@JensSchmidt72 commented Apr 19, 2023

Hi Ben :)
If I remember correctly, an artist friend of mine said he had better success using Blender's "Mist" pass in A1111 than the z-buffer. The Blender manual describes the Mist pass as: "Distance to visible surfaces, mapped to the 0.0 - 1.0 range." It sounds like they map the z-buffer values from the camera clip to the farthest pixel; in that case it would naturally give better separation between objects in smaller scenes, like the preprocessor does :)
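To illustrate the mapping I mean, something like this per-frame min/max remap (a sketch of the idea, not Blender's actual implementation):

```python
import numpy as np

def normalize_depth(depth: np.ndarray) -> np.ndarray:
    """Remap raw depth so the nearest pixel -> 0.0 and the farthest -> 1.0."""
    near, far = depth.min(), depth.max()
    if far - near < 1e-8:                # flat buffer: avoid division by zero
        return np.zeros_like(depth)
    return (depth - near) / (far - near)
```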

edit: grammar and clarity

@JensSchmidt72

Mist info:
I checked the Mist pass in Blender 3.5 and for me it defaults to Start = 5m and Depth = 25m. In other words, there is no auto-magic normalization going on.
Also, it's inverted in color compared to the z-buffer.
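The range can be adjusted per scene in Python, something like this (values are illustrative):

```python
import bpy

# Mist settings live on the scene's world (defaults: Start 5m, Depth 25m).
mist = bpy.context.scene.world.mist_settings
mist.start = 0.0      # distance where the mist value starts rising
mist.depth = 10.0     # distance over which it ramps from 0.0 to 1.0
mist.falloff = 'LINEAR'
```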

@benrugg (Owner) commented Apr 19, 2023

Ah, yeah, I'll have to check into the mist pass. That's a great idea. Sucks that there's no automatic normalization of some kind.

In my tests with the depth pass, I was sending it through a normalization node in Blender first, which would still make sense. But if the mist pass defaults to 5-25m, it could easily come out all white or all black, huh? I'd need to include more instructions, etc.
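For reference, that compositor wiring looks roughly like this in Python (a sketch, assuming the Z pass is enabled; the Invert node flips the depth/mist convention toward near-is-white, which is what ControlNet's depth preprocessor produces):

```python
import bpy

scene = bpy.context.scene
scene.use_nodes = True
nodes, links = scene.node_tree.nodes, scene.node_tree.links

rl = nodes.new("CompositorNodeRLayers")      # exposes the render passes
norm = nodes.new("CompositorNodeNormalize")  # per-frame min/max -> 0.0-1.0
inv = nodes.new("CompositorNodeInvert")      # flip so near = white
comp = nodes.new("CompositorNodeComposite")

links.new(rl.outputs["Depth"], norm.inputs[0])
links.new(norm.outputs[0], inv.inputs["Color"])
links.new(inv.outputs["Color"], comp.inputs["Image"])
```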

@JensSchmidt72

Yeah, it could fail to black/white. Of course, UI-wise, it would be amazing if one didn't need to adjust any settings but instead got a relevant normalization automatically :)

@benrugg (Owner) commented Apr 21, 2023

Yeah, I usually agonize over the UI to try to make it as easy as possible. (It's so difficult in Blender!)

@JensSchmidt72

I have very little knowledge about how add-ons (and the UI system) are built in Blender, but I'll take your word for it.
Would a node-based approach be easier to implement? It could be more powerful, but it would of course require more Blender knowledge to use.

@benrugg (Owner) commented Apr 21, 2023

I think there could be a good case for nodes. There are a few features I could add to AI Render where nodes would be helpful (the depth, mist, and normal passes we're talking about, and also a way to give different prompts for different areas of the image!).

At the moment, this is more than I'm planning on doing, but it could be great in the future.
