Picsart-AI-Research/PAIR-Diffusion

PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor

[Project Page] [arXiv] [pdf] [BibTeX]

Framework: PyTorch HuggingFace space YouTube License

Vidit Goel1, Elia Peruzzo1,2, Yifan Jiang3, Dejia Xu3, Xingqian Xu3, Nicu Sebe2, Trevor Darrell4, Zhangyang Wang1,3 and Humphrey Shi1,5,6

Features

All the operations below can be performed at an object level. Our framework is general and can be applied to any diffusion model.

  1. Appearance Editing
  2. Free Form Shape Editing
  3. Adding Objects
  4. Variation
  5. Multimodal control using reference images and text
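To make "object level" concrete, here is a toy sketch in pure Python. It is not the PAIR Diffusion method and does not use the repository's code. It only illustrates an edit confined to the pixels of one object, assuming a grayscale image and a segmentation map stored as same-sized nested lists. The function name `edit_object` and its parameters are hypothetical.

```python
from typing import List

Grid = List[List[int]]


def edit_object(image: Grid, seg: Grid, object_id: int, delta: int) -> Grid:
    """Return a copy of `image` where only pixels labelled `object_id` change.

    `image` holds grayscale values in [0, 255]; `seg` holds one integer
    object label per pixel. Both are hypothetical toy inputs.
    """
    return [
        [
            # Shift and clamp pixels of the chosen object; copy the rest.
            min(255, max(0, px + delta)) if label == object_id else px
            for px, label in zip(img_row, seg_row)
        ]
        for img_row, seg_row in zip(image, seg)
    ]


# Toy 2x3 image with two regions: background (label 0) and one object (label 1).
image = [[10, 10, 200], [10, 200, 200]]
seg = [[0, 0, 1], [0, 1, 1]]

edited = edit_object(image, seg, object_id=1, delta=40)
```

In this sketch the background pixels are untouched, which is the property the feature list refers to: each operation targets a single object rather than the whole image.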

News

  • [30/12/2023] Models and code released.
  • [10/11/2023] New and improved method and models 🚀🚀. Models and code will be released soon.
  • [04/09/2023] Inference code released
  • [04/07/2023] Demo released on 🤗Huggingface space!
  • [03/30/2023] Paper released on arXiv

Results

Given below are results for appearance editing using our method on SDv1.5.

Object Level Image Editing

Appearance Editing

Free form Shape Editing and Adding Object

For more results please refer to our project page and paper.

Requirements

Set up the conda environment using the commands below. We use OneFormer to obtain segmentation maps during inference; please set up the environment for OneFormer by following its repository.

conda env create -f environment.yml
conda activate pair-diff
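As a rough illustration of how a segmentation map is consumed downstream, the sketch below splits a label map into one boolean mask per object. It assumes the map is a 2D grid of integer labels stored as nested lists; the actual output format of OneFormer and the way this repository handles it are not shown here, and `masks_from_label_map` is a hypothetical helper.

```python
from typing import Dict, List


def masks_from_label_map(seg: List[List[int]]) -> Dict[int, List[List[bool]]]:
    """Split an integer label map into one boolean mask per label."""
    masks: Dict[int, List[List[bool]]] = {}
    for r, row in enumerate(seg):
        for c, label in enumerate(row):
            if label not in masks:
                # Start each new label with an all-False mask of the same shape.
                masks[label] = [[False] * len(seg_row) for seg_row in seg]
            masks[label][r][c] = True
    return masks


# Toy 2x3 label map with three segments (labels 0, 1 and 2).
seg = [[0, 0, 1], [2, 1, 1]]
masks = masks_from_label_map(seg)

# Pixel count per segment, e.g. for picking the largest object to edit.
areas = {label: sum(map(sum, mask)) for label, mask in masks.items()}
```

Each mask selects exactly the pixels of one segment, so an edit can be restricted to a chosen object.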

Inference

To run the model, launch the Gradio demo using the command below. It also downloads the required models.

python gradio_demo.py

Pretrained Models

We applied PAIR Diffusion to SDv1.5 and used the COCO-Stuff dataset to fine-tune the model. The model card can be downloaded from here.

BibTeX

If you use our work in your research, please cite our publication:

@article{goel2023pair,
      title={PAIR-Diffusion: Object-Level Image Editing with Structure-and-Appearance Paired Diffusion Models},
      author={Goel, Vidit and Peruzzo, Elia and Jiang, Yifan and Xu, Dejia and Xu, Xingqian and Sebe, Nicu and Darrell, Trevor and Wang, Zhangyang and Shi, Humphrey},
      journal={arXiv preprint arXiv:2303.17546},
      year={2023}
}

About

[CVPR 2024] PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor
