Salient Object Aware Background Generation

This repository accompanies our paper, Salient Object-Aware Background Generation using Text-Guided Diffusion Models, which has been accepted for publication at the CVPR 2024 Generative Models for Computer Vision workshop. You can try our model on Hugging Face.

The paper addresses an issue we call "object expansion" when generating backgrounds for salient objects using inpainting diffusion models. We show that models such as Stable Inpainting can sometimes arbitrarily expand or distort the salient object, which is undesirable in applications where the object's identity should be preserved, such as e-commerce ads.

[Figure: examples of object expansion]
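As a rough illustration of what object expansion means in terms of masks, the sketch below compares a salient object mask of the input with a mask of the object in the generated image, and reports how much object area appeared outside the original silhouette. This is a toy under stated assumptions, not the metric defined in the paper: the function name `object_expansion` is hypothetical, and plain nested lists stand in for image arrays to keep the example dependency-free.

```python
def object_expansion(original_mask, generated_mask):
    """Return the object area added outside the original mask, relative to the original area.

    Both masks are lists of rows holding truthy (object) or falsy (background) values.
    A result of 0.0 means the object did not grow; 0.5 means it grew by half its area.
    """
    if len(original_mask) != len(generated_mask):
        raise ValueError("masks must have the same number of rows")

    original_area = 0
    expanded_area = 0
    for row_orig, row_gen in zip(original_mask, generated_mask):
        if len(row_orig) != len(row_gen):
            raise ValueError("mask rows must have the same length")
        for orig_px, gen_px in zip(row_orig, row_gen):
            if orig_px:
                original_area += 1
            elif gen_px:
                # Object pixel in the output where the input had background.
                expanded_area += 1

    if original_area == 0:
        raise ValueError("original mask contains no object pixels")
    return expanded_area / original_area


# Hypothetical 2x3 masks: the object gains two pixels outside its original shape.
before = [[0, 1, 0],
          [0, 1, 0]]
after = [[1, 1, 0],
         [0, 1, 1]]
expansion = object_expansion(before, after)
```

In practice the second mask would come from running a salient object segmentation model on the generated image; which model to use is left open here.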

Setup

The dependencies are listed in requirements.txt. Install them with:

pip install -r requirements.txt

Usage

Training

The following command trains a text-to-image inpainting ControlNet initialized with the weights of "stable-diffusion-2-inpainting":

accelerate launch --multi_gpu --mixed_precision=fp16 --num_processes=8 train_controlnet_inpaint.py --pretrained_model_name_or_path "stable-diffusion-2-inpainting" --proportion_empty_prompts 0.1

The following command trains a text-to-image ControlNet initialized with the weights of "stable-diffusion-2-base":

accelerate launch --multi_gpu --mixed_precision=fp16 --num_processes=8 train_controlnet.py --pretrained_model_name_or_path "stable-diffusion-2-base" --proportion_empty_prompts 0.1
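Both commands pass `--proportion_empty_prompts 0.1`. Assuming the flag behaves as it does in the diffusers ControlNet training example (an assumption, since the scripts in this repository are not shown here), it replaces that fraction of training captions with empty strings, which is commonly done so the model also learns an unconditional branch for classifier-free guidance. The sketch below illustrates the idea; `drop_prompts` is a hypothetical helper, not a function from the training scripts.

```python
import random


def drop_prompts(prompts, proportion_empty=0.1, seed=None):
    """Return a copy of prompts where each one is blanked with probability proportion_empty."""
    if not 0.0 <= proportion_empty <= 1.0:
        raise ValueError("proportion_empty must be between 0 and 1")
    rng = random.Random(seed)  # local generator, so the global random state is untouched
    return ["" if rng.random() < proportion_empty else prompt for prompt in prompts]


# Hypothetical captions; roughly one in ten becomes an empty string.
captions = ["a red shoe on a wooden table"] * 10000
dropped = drop_prompts(captions, proportion_empty=0.1, seed=0)
empty_fraction = dropped.count("") / len(dropped)
```

With a proportion of 0 the captions pass through unchanged, and with 1 every caption is blanked.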

Inference

Please refer to inference.ipynb. To run the code, you need to download our model checkpoints. You can also try our model using a Hugging Face pipeline:

from diffusers import DiffusionPipeline
pipeline = DiffusionPipeline.from_pretrained("yahoo-inc/photo-background-generation")
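Background generation is inpainting with the mask flipped: the salient object is kept and everything else is regenerated. In the usual diffusers inpainting convention, white mask pixels are repainted and black pixels are preserved, so a salient object mask has to be inverted before it is used as the inpainting mask. The sketch below shows that inversion on plain nested lists so it stays dependency-free. `background_mask` and its threshold are hypothetical, and how the mask is passed to this pipeline is not shown here; inference.ipynb is the reference for the actual call.

```python
def background_mask(salient_mask, threshold=128):
    """Invert a grayscale salient object mask given as rows of 0..255 values.

    In the result, 255 marks background pixels to regenerate and 0 marks
    object pixels to keep.
    """
    return [
        [0 if pixel >= threshold else 255 for pixel in row]
        for row in salient_mask
    ]


# Hypothetical 2x2 salient mask: bright values belong to the object.
salient = [[0, 255],
           [200, 10]]
inpaint_mask = background_mask(salient)
```

With real images the same inversion would typically be done on a PIL image or an array rather than on lists.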

Model Checkpoints

Model link	Datasets used
controlnet_inpainting_salient_aware.pth	Salient segmentation datasets, COCO

Citations

If you find our work useful, please consider citing our paper:

@misc{eshratifar2024salient,
      title={Salient Object-Aware Background Generation using Text-Guided Diffusion Models}, 
      author={Amir Erfan Eshratifar and Joao V. B. Soares and Kapil Thadani and Shaunak Mishra and Mikhail Kuznetsov and Yueh-Ning Ku and Paloma de Juan},
      year={2024},
      eprint={2404.10157},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Maintainers

Erfan Eshratifar: erfan.eshratifar@yahooinc.com
Joao Soares: jvbsoares@yahooinc.com

License

This project is licensed under the terms of the Apache 2.0 open source license. Please refer to LICENSE for the full terms.
