Awesome Diffusion Categorized

Contents

- Storytelling
- Try On
- Drag Edit
- Diffusion Models Inversion
- Text Guided Image Editing
- Continual Learning
- Remove Concept
- New Concept Learning
- T2I Diffusion Model Augmentation
- Spatial Control
- I2I Translation
- Segmentation Detection Tracking
- Additional Conditions

Storytelling

⭐⭐Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models
[CVPR 2024] [Project] [Code]

⭐⭐Training-Free Consistent Text-to-Image Generation
[Website] [Project] [Code]

StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
[Website] [Project] [Code]

StoryGPT-V: Large Language Models as Consistent Story Visualizers
[Website] [Project] [Code]

The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
[Website] [Project] [Code]

Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
[Website] [Project] [Code]

TaleCrafter: Interactive Story Visualization with Multiple Characters
[Website] [Project] [Code]

Make-A-Story: Visual Memory Conditioned Consistent Story Generation
[CVPR 2023] [Code]

Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
[Website] [Code]

StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion
[Website] [Code]

Masked Generative Story Transformer with Character Guidance and Caption Augmentation
[Website] [Code]

StoryBench: A Multifaceted Benchmark for Continuous Story Visualization
[Website] [Code]

MagicScroll: Nontypical Aspect-Ratio Image Generation for Visual Storytelling via Multi-Layered Semantic-Aware Denoising
[Website] [Project]

Causal-Story: Local Causal Attention Utilizing Parameter-Efficient Tuning For Visual Story Synthesis
[ICASSP 2024]

CogCartoon: Towards Practical Story Visualization
[Website]

Generating coherent comic with rich story using ChatGPT and Stable Diffusion
[Website]

Improved Visual Story Generation with Adaptive Context Modeling
[Website]

Make-A-Storyboard: A General Framework for Storyboard with Disentangled and Merged Control
[Website]

Zero-shot Generation of Coherent Storybook from Plain Text Story using Diffusion Models
[Website]

Try On

TryOnDiffusion: A Tale of Two UNets
[CVPR 2023] [Website] [Project] [Code]

StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On
[CVPR 2024] [Project] [Code]

Street TryOn: Learning In-the-Wild Virtual Try-On from Unpaired Person Images
[Website] [Project] [Code]

From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation
[Website] [Project] [Code]

PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns
[Website] [Project] [Code]

Taming the Power of Diffusion Models for High-Quality Virtual Try-On with Appearance Flow
[ACM MM 2023] [Code]

LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On
[ACM MM 2023] [Code]

OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
[Website] [Code]

DreamPaint: Few-Shot Inpainting of E-Commerce Items for Virtual Try-On without 3D Modeling
[Website] [Code]

CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model
[Website] [Code]

MV-VTON: Multi-View Virtual Try-On with Diffusion Models
[Website] [Code]

StableGarment: Garment-Centric Generation via Stable Diffusion
[Website] [Project]

Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos
[Website] [Project]

Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All
[Website] [Project]

Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment
[Website] [Project]

WarpDiffusion: Efficient Diffusion Model for High-Fidelity Virtual Try-on
[Website]

FLDM-VTON: Faithful Latent Diffusion Model for Virtual Try-on
[Website]

Product-Level Try-on: Characteristics-preserving Try-on with Realistic Clothes Shading and Wrinkles
[Website]

Mobile Fitting Room: On-device Virtual Try-on via Diffusion Models
[Website]

Improving Diffusion Models for Virtual Try-on
[Website]

Time-Efficient and Identity-Consistent Virtual Try-On Using A Variant of Altered Diffusion Models
[Website]

ACDG-VTON: Accurate and Contained Diffusion Generation for Virtual Try-On
[Website]

Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On
[Website]

ShoeModel: Learning to Wear on the User-specified Shoes via Diffusion Model
[Website]

Drag Edit

DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models
[ICLR 2024] [Website] [Project] [Code]

Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
[SIGGRAPH 2023] [Project] [Code]

FreeDrag: Feature Dragging for Reliable Point-based Image Editing
[CVPR 2024] [Project] [Code]

DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing
[CVPR 2024] [Project] [Code]
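
Common to these point-based editors is a "motion supervision" loop: optimize the diffusion latent so the UNet features under each handle point migrate, step by step, toward its target point. Below is a hedged, self-contained sketch of one such step; `feat_fn`, the point coordinates, and all hyperparameters are illustrative stand-ins, not any single paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def motion_supervision_step(latent, feat_fn, handle, target, lr=0.01, r=3):
    # latent: (1, C, H, W) diffusion latent at the editing timestep
    # feat_fn: latent -> (1, D, H, W) intermediate UNet feature map (e.g., via a hook)
    # handle, target: (x, y) coordinates in feature-map space, away from borders
    latent = latent.detach().requires_grad_(True)
    feats = feat_fn(latent)
    hx, hy = handle
    d = torch.tensor([target[0] - hx, target[1] - hy], dtype=torch.float32)
    d = d / (d.norm() + 1e-8)                       # unit step toward the target
    sx, sy = int(round(hx + d[0].item())), int(round(hy + d[1].item()))
    moved = feats[..., sy - r:sy + r + 1, sx - r:sx + r + 1]          # differentiable
    ref = feats[..., hy - r:hy + r + 1, hx - r:hx + r + 1].detach()   # frozen content
    loss = F.l1_loss(moved, ref)    # pull the handle's content one step along d
    loss.backward()
    with torch.no_grad():
        return latent - lr * latent.grad
```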

GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models
[Website] [Project] [Code]

Repositioning the Subject within Image
[Website] [Project] [Code]

Drag-A-Video: Non-rigid Video Editing with Point-based Interaction
[Website] [Project] [Code]

DragAnything: Motion Control for Anything using Entity Representation
[Website] [Project] [Code]

DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing
[CVPR 2024] [Code]

Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation
[CVPR 2024] [Code]

DragVideo: Interactive Drag-style Video Editing
[Website] [Code]

RotationDrag: Point-based Image Editing with Rotated Diffusion Features
[Website] [Code]

DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory
[Website] [Project]

Readout Guidance: Learning Control from Diffusion Features
[Website] [Project]

StableDrag: Stable Dragging for Point-based Image Editing
[Website] [Project]

Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators
[Website]

Diffusion Models Inversion

⭐⭐⭐Null-text Inversion for Editing Real Images using Guided Diffusion Models
[CVPR 2023] [Website] [Project] [Code]
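
Null-text Inversion builds on plain DDIM inversion: the real image is first inverted deterministically, then the unconditional ("null") embedding is optimized so classifier-free-guided sampling retraces that trajectory. A minimal sketch of the DDIM inversion half, assuming a diffusers Stable Diffusion pipeline (model id, prompt, and step count are illustrative):

```python
import torch
from diffusers import StableDiffusionPipeline, DDIMScheduler

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
pipe.scheduler.set_timesteps(50)

tokens = pipe.tokenizer("a photo of a cat", padding="max_length",
                        max_length=pipe.tokenizer.model_max_length,
                        return_tensors="pt")
prompt_embeds = pipe.text_encoder(tokens.input_ids)[0]

@torch.no_grad()
def ddim_invert(latents, prompt_embeds):
    # Run the deterministic DDIM update backwards: clean latent -> noise.
    # Inversion uses guidance scale 1 (no CFG); Null-text Inversion then tunes
    # the null embedding so that CFG sampling still reconstructs the image.
    alphas = pipe.scheduler.alphas_cumprod
    timesteps = list(pipe.scheduler.timesteps)[::-1]   # low noise -> high noise
    for t, t_next in zip(timesteps[:-1], timesteps[1:]):
        eps = pipe.unet(latents, t, encoder_hidden_states=prompt_embeds).sample
        a_t, a_next = alphas[t], alphas[t_next]
        x0 = (latents - (1 - a_t).sqrt() * eps) / a_t.sqrt()  # predicted clean latent
        latents = a_next.sqrt() * x0 + (1 - a_next).sqrt() * eps
    return latents
```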

⭐⭐Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code
[ICLR 2024] [Website] [Project] [Code]

Inversion-Based Creativity Transfer with Diffusion Models
[CVPR 2023] [Website] [Code]

EDICT: Exact Diffusion Inversion via Coupled Transformations
[CVPR 2023] [Website] [Code]

Improving Negative-Prompt Inversion via Proximal Guidance
[Website] [Code]

An Edit Friendly DDPM Noise Space: Inversion and Manipulations
[CVPR 2024] [Project] [Code] [Demo]

Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing
[NeurIPS 2023] [Website] [Code]

Inversion-Free Image Editing with Natural Language
[CVPR 2024] [Project] [Code]

Noise Map Guidance: Inversion with Spatial Context for Real Image Editing
[ICLR 2024] [Website] [Code]

IterInv: Iterative Inversion for Pixel-Level T2I Models
[NeurIPS-W 2023] [OpenReview] [NeurIPS-W] [Website] [Code]

Object-aware Inversion and Reassembly for Image Editing
[Website] [Project] [Code]

ReNoise: Real Image Inversion Through Iterative Noising
[Website] [Project] [Code]

LocInv: Localization-aware Inversion for Text-Guided Image Editing
[CVPR 2024 AI4CC workshop] [Code]

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing
[Website] [Code]

Generating Non-Stationary Textures using Self-Rectification
[Website] [Code]

Accelerating Diffusion Models for Inverse Problems through Shortcut Sampling
[Website] [Code]

Exact Diffusion Inversion via Bi-directional Integration Approximation
[Website] [Code]

Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing
[Website] [Code]

Source Prompt Disentangled Inversion for Boosting Image Editability with Diffusion Models
[Website] [Code]

Effective Real Image Editing with Accelerated Iterative Diffusion Inversion
[ICCV 2023 Oral] [Website]

Score-Based Diffusion Models as Principled Priors for Inverse Imaging
[ICCV 2023] [Website]

BARET: Balanced Attention based Real image Editing driven by Target-text Inversion
[WACV 2024]

Wavelet-Guided Acceleration of Text Inversion in Diffusion-Based Image Editing
[ICASSP 2024]

Negative-prompt Inversion: Fast Image Inversion for Editing with Text-guided Diffusion Models
[Website]

Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models
[Website]

Fixed-point Inversion for Text-to-image diffusion models
[Website]

Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models
[Website]

KV Inversion: KV Embeddings Learning for Text-Conditioned Real Image Action Editing
[Website]

Tuning-Free Inversion-Enhanced Control for Consistent Image Editing
[Website]

LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance
[Website]

LEDITS++: Limitless Image Editing using Text-to-Image Models
[Website]
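
LEDITS++ is integrated in diffusers as an invert-then-edit pipeline. A sketch under illustrative settings (the input file, editing concept, and scales are placeholders):

```python
import torch
from diffusers import LEditsPPPipelineStableDiffusion
from diffusers.utils import load_image

pipe = LEditsPPPipelineStableDiffusion.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
image = load_image("portrait.png")                 # placeholder input image
_ = pipe.invert(image=image, num_inversion_steps=50, skip=0.1)
edited = pipe(editing_prompt=["sunglasses"],       # concept to add
              edit_guidance_scale=10.0, edit_threshold=0.75).images[0]
```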

Text Guided Image Editing

⭐⭐⭐Prompt-to-Prompt Image Editing with Cross Attention Control
[ICLR 2023] [Website] [Project] [Code] [Replicate Demo]
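
Prompt-to-Prompt edits work by capturing and re-injecting the UNet's text-to-image cross-attention maps. A simplified sketch of the capture half via a custom diffusers attention processor, assuming an SD 1.x pipeline (the shipped processors additionally handle norms, masks, and residuals that SD 1.x cross-attention does not need):

```python
import torch
from diffusers import StableDiffusionPipeline

class StoreCrossAttn:
    """Record cross-attention probabilities; simplified for SD 1.x blocks."""
    def __init__(self, store):
        self.store = store
    def __call__(self, attn, hidden_states, encoder_hidden_states=None,
                 attention_mask=None, **kwargs):
        is_cross = encoder_hidden_states is not None
        context = encoder_hidden_states if is_cross else hidden_states
        q = attn.head_to_batch_dim(attn.to_q(hidden_states))
        k = attn.head_to_batch_dim(attn.to_k(context))
        v = attn.head_to_batch_dim(attn.to_v(context))
        probs = attn.get_attention_scores(q, k, attention_mask)
        if is_cross:                                # (batch*heads, pixels, tokens)
            self.store.append(probs.detach().cpu())
        out = attn.batch_to_head_dim(torch.bmm(probs, v))
        return attn.to_out[1](attn.to_out[0](out))  # output projection + dropout

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
maps = []
pipe.unet.set_attn_processor(StoreCrossAttn(maps))
_ = pipe("a cat wearing a red hat", num_inference_steps=10)
# Prompt-to-Prompt edits re-inject or re-weight these per-token maps
# in a second pass with the edited prompt.
```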

⭐⭐⭐Zero-shot Image-to-Image Translation
[SIGGRAPH 2023] [Project] [Code] [Replicate Demo] [Diffusers Doc] [Diffusers Code]

⭐⭐InstructPix2Pix: Learning to Follow Image Editing Instructions
[CVPR 2023 (Highlight)] [Website] [Project] [Diffusers Doc] [Diffusers Code] [Official Code] [Dataset]
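 
InstructPix2Pix is available as a dedicated diffusers pipeline; the prompt is an edit instruction rather than a caption, and `image_guidance_scale` trades edit strength against faithfulness to the input (file name and settings below are illustrative):

```python
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from diffusers.utils import load_image

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")
image = load_image("input.jpg")                    # any RGB image
edited = pipe(
    "turn the sky into a starry night",            # edit instruction, not a caption
    image=image,
    num_inference_steps=20,
    image_guidance_scale=1.5,                      # how closely to follow the input
).images[0]
```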

⭐⭐Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
[CVPR 2023] [Website] [Project] [Code] [Dataset] [Replicate Demo] [Demo]

DiffEdit: Diffusion-based semantic image editing with mask guidance
[ICLR 2023] [Website] [Unofficial Code] [Diffusers Doc] [Diffusers Code]
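
DiffEdit's mask-then-edit recipe is exposed in diffusers as three calls: generate a mask by contrasting two prompts, invert the image, then denoise the masked region toward the target prompt. A sketch with illustrative model id and file name:

```python
import torch
from diffusers import (StableDiffusionDiffEditPipeline,
                       DDIMScheduler, DDIMInverseScheduler)
from diffusers.utils import load_image

pipe = StableDiffusionDiffEditPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
pipe.inverse_scheduler = DDIMInverseScheduler.from_config(pipe.scheduler.config)

image = load_image("fruit_bowl.png").resize((768, 768))
mask = pipe.generate_mask(image=image, source_prompt="a bowl of fruits",
                          target_prompt="a basket of pears")
inv_latents = pipe.invert(prompt="a bowl of fruits", image=image).latents
edited = pipe(prompt="a basket of pears", mask_image=mask,
              image_latents=inv_latents).images[0]
```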

Imagic: Text-Based Real Image Editing with Diffusion Models
[CVPR 2023] [Website] [Project] [Diffusers]

Inpaint Anything: Segment Anything Meets Image Inpainting
[Website] [Code 1] [Code 2]

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
[ICCV 2023] [Website] [Project] [Code] [Demo]

Collaborative Score Distillation for Consistent Visual Synthesis
[NeurIPS 2023] [Website] [Project] [Code]

Visual Instruction Inversion: Image Editing via Visual Prompting
[NeurIPS 2023] [Website] [Project] [Code]

Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models
[NeurIPS 2023] [Website] [Code]

Localizing Object-level Shape Variations with Text-to-Image Diffusion Models
[ICCV 2023] [Website] [Project] [Code]

Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance
[Website] [Code1] [Code2] [Diffusers Code]

PAIR-Diffusion: Object-Level Image Editing with Structure-and-Appearance Paired Diffusion Models
[Website] [Project] [Code] [Demo]

SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models
[CVPR 2024] [Project] [Code]

MultiBooth: Towards Generating All Your Concepts in an Image from Text
[Website] [Project] [Code]

Infusion: Preventing Customized Text-to-Image Diffusion from Overfitting
[Website] [Project] [Code]

StyleBooth: Image Style Editing with Multimodal Instruction
[Website] [Project] [Code]

SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing
[Website] [Project] [Code]

EditVal: Benchmarking Diffusion Based Text-Guided Image Editing Methods
[Website] [Project] [Code]

InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions
[Website] [Project] [Code]

Text-Driven Image Editing via Learnable Regions
[Website] [Project] [Code]

Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing
[Website] [Project] [Code]

MDP: A Generalized Framework for Text-Guided Image Editing by Manipulating the Diffusion Path
[Website] [Project] [Code]

HIVE: Harnessing Human Feedback for Instructional Visual Editing
[Website] [Project] [Code]

FaceStudio: Put Your Face Everywhere in Seconds
[Website] [Project] [Code]

Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
[Website] [Project] [Code]

MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance
[Website] [Project] [Code]

MirrorDiffusion: Stabilizing Diffusion Process in Zero-shot Image Translation by Prompts Redescription and Beyond
[Website] [Project] [Code]

Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators
[Website] [Project] [Code]

UniTune: Text-Driven Image Editing by Fine Tuning an Image Generation Model on a Single Image
[SIGGRAPH 2023] [Code]

Learning to Follow Object-Centric Image Editing Instructions Faithfully
[EMNLP 2023] [Code]

TiNO-Edit: Timestep and Noise Optimization for Robust Diffusion-Based Image Editing
[CVPR 2024] [Code]

ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing
[Website] [Code]

Differential Diffusion: Giving Each Pixel Its Strength
[Website] [Code]

Region-Aware Diffusion for Zero-shot Text-driven Image Editing
[Website] [Code]

Forgedit: Text Guided Image Editing via Learning and Forgetting
[Website] [Code]

AdapEdit: Spatio-Temporal Guided Adaptive Editing Algorithm for Text-Based Continuity-Sensitive Image Editing
[Website] [Code]

Focus on Your Instruction: Fine-grained and Multi-instruction Image Editing by Attention Modulation
[Website] [Code]

Unified Diffusion-Based Rigid and Non-Rigid Editing with Text and Image Guidance
[Website] [Code]

SpecRef: A Fast Training-free Baseline of Specific Reference-Condition Real Image Editing
[Website] [Code]

Conditional Score Guidance for Text-Driven Image-to-Image Translation
[NeurIPS 2023] [Website]

Emu Edit: Precise Image Editing via Recognition and Generation Tasks
[CVPR 2024] [Project]

Editable Image Elements for Controllable Synthesis
[Website] [Project]

LIME: Localized Image Editing via Attention Regularization in Diffusion Models
[Website] [Project]

Watch Your Steps: Local Image and Scene Editing by Text Instructions
[Website] [Project]

Delta Denoising Score
[Website] [Project]

ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation
[Website] [Project]

ByteEdit: Boost, Comply and Accelerate Generative Image Editing
[Website] [Project]

GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models
[Website] [Project]

MoEController: Instruction-based Arbitrary Image Manipulation with Mixture-of-Expert Controllers
[Website] [Project]

FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing
[Website] [Project]

GeoDiffuser: Geometry-Based Image Editing with Diffusion Models
[Website] [Project]

Iterative Multi-granular Image Editing using Diffusion Models
[WACV 2024]

Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks
[AAAI 2024]

Text-to-image Editing by Image Information Removal
[WACV 2024]

Face Aging via Diffusion-based Editing
[BMVC 2023]

Custom-Edit: Text-Guided Image Editing with Customized Diffusion Models
[CVPR 2023 AI4CC Workshop]

TexSliders: Diffusion-Based Texture Editing in CLIP Space
[Website]

DM-Align: Leveraging the Power of Natural Language Instructions to Make Changes to Images
[Website]

FISEdit: Accelerating Text-to-image Editing via Cache-enabled Sparse Diffusion Inference
[Website]

LayerDiffusion: Layered Controlled Image Editing with Diffusion Models
[Website]

iEdit: Localised Text-guided Image Editing with Weak Supervision
[Website]

User-friendly Image Editing with Minimal Text Input: Leveraging Captioning and Injection Techniques
[Website]

PFB-Diff: Progressive Feature Blending Diffusion for Text-driven Image Editing
[Website]

PRedItOR: Text Guided Image Editing with Diffusion Prior
[Website]

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
[Website]

FEC: Three Finetuning-free Methods to Enhance Consistency for Real Image Editing
[Website]

The Blessing of Randomness: SDE Beats ODE in General Diffusion-based Image Editing
[Website]

ZONE: Zero-Shot Instruction-Guided Local Editing
[Website]

Image Translation as Diffusion Visual Programmers
[Website]

Latent Inversion with Timestep-aware Sampling for Training-free Non-rigid Editing
[Website]

LoMOE: Localized Multi-Object Editing via Multi-Diffusion
[Website]

Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing
[Website]

An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control
[Website]

DiffChat: Learning to Chat with Text-to-Image Synthesis Models for Interactive Image Creation
[Website]

InstructGIE: Towards Generalizable Image Editing
[Website]

DreamSampler: Unifying Diffusion Sampling and Score Distillation for Image Manipulation
[Website]

LASPA: Latent Spatial Alignment for Fast Training-free Single Image Editing
[Website]

Ground-A-Score: Scaling Up the Score Distillation for Multi-Attribute Editing
[Website]

Uncovering the Text Embedding in Text-to-Image Diffusion Models
[Website]

Tuning-Free Adaptive Style Incorporation for Structure-Consistent Text-Driven Style Transfer
[Website]

FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models
[Website]

Continual Learning

RGBD2: Generative Scene Synthesis via Incremental View Inpainting using RGBD Diffusion Models
[CVPR 2023] [Website] [Project] [Code]

Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models
[Website] [Code]

Continual Learning of Diffusion Models with Generative Distillation
[Website] [Code]

Prompt-Based Exemplar Super-Compression and Regeneration for Class-Incremental Learning
[Website] [Code]

Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA
[Website] [Project]

Class-Incremental Learning using Diffusion Model for Distillation and Replay
[ICCV 2023 VCL workshop best paper]

Create Your World: Lifelong Text-to-Image Diffusion
[Website]

Exploring Continual Learning of Diffusion Models
[Website]

DiracDiffusion: Denoising and Incremental Reconstruction with Assured Data-Consistency
[Website]

DiffusePast: Diffusion-based Generative Replay for Class Incremental Semantic Segmentation
[Website]

Continual Diffusion with STAMINA: STack-And-Mask INcremental Adapters
[Website]

Premonition: Using Generative Models to Preempt Future Data Changes in Continual Learning
[Website]

MuseumMaker: Continual Style Customization without Catastrophic Forgetting
[Website]

Remove Concept

Ablating Concepts in Text-to-Image Diffusion Models
[ICCV 2023] [Website] [Project] [Code]

Erasing Concepts from Diffusion Models
[ICCV 2023] [Website] [Project] [Code]
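
The core of ESD-style erasure is a distillation target that negatively guides the concept: the tuned UNet learns to predict what a frozen copy would predict when steered *away* from the concept. A hedged sketch of one training step; function and argument names are illustrative:

```python
import torch
import torch.nn.functional as F

def erase_step(unet, frozen_unet, noisy_latents, t, concept_emb, null_emb, eta=1.0):
    # The frozen model defines a negatively guided target for the concept prompt.
    with torch.no_grad():
        eps_null = frozen_unet(noisy_latents, t, encoder_hidden_states=null_emb).sample
        eps_conc = frozen_unet(noisy_latents, t, encoder_hidden_states=concept_emb).sample
        target = eps_null - eta * (eps_conc - eps_null)   # steer away from the concept
    eps = unet(noisy_latents, t, encoder_hidden_states=concept_emb).sample
    return F.mse_loss(eps, target)   # backprop into (a subset of) unet's weights
```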

Paint by Inpaint: Learning to Add Image Objects by Removing Them First
[Website] [Project] [Code]

One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications
[Website] [Project] [Code]

Editing Massive Concepts in Text-to-Image Diffusion Models
[Website] [Project] [Code]

Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models
[ICML 2023 workshop] [Code]

ObjectAdd: Adding Objects into Image via a Training-Free Diffusion Modification Fashion
[Website] [Code]

Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
[Website] [Code]

Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models
[Website] [Code]

MACE: Mass Concept Erasure in Diffusion Models
[CVPR 2024]

Geom-Erasing: Geometry-Driven Removal of Implicit Concept in Diffusion Models
[Website]

Receler: Reliable Concept Erasing of Text-to-Image Diffusion Models via Lightweight Erasers
[Website]

All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models
[Website]

EraseDiff: Erasing Data Influence in Diffusion Models
[Website]

UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models
[Website]

Removing Undesirable Concepts in Text-to-Image Generative Models with Learnable Prompts
[Website]

New Concept Learning

⭐⭐⭐DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
[CVPR 2023 Honorable Mention] [Website] [Project] [Official Dataset] [Unofficial Code] [Diffusers Doc] [Diffusers Code]
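
After DreamBooth fine-tuning (e.g., with the diffusers training example this entry links to), the subject is invoked through its rare identifier token at inference. A sketch; the checkpoint path and the `sks` identifier are placeholders:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "./dreambooth-output", torch_dtype=torch.float16   # your fine-tuned checkpoint
).to("cuda")
image = pipe("a photo of sks dog swimming in a pool",  # "sks" = learned identifier
             num_inference_steps=50, guidance_scale=7.5).images[0]
```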

⭐⭐⭐An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
[ICLR 2023 top-25%] [Website] [Diffusers Doc] [Diffusers Code] [Code]
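
Textual Inversion concepts load into a stock pipeline as pseudo-tokens; diffusers exposes this directly (the `sd-concepts-library/cat-toy` concept and its `<cat-toy>` token are an example from the docs):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# Load a learned concept embedding; it becomes addressable as a pseudo-token.
pipe.load_textual_inversion("sd-concepts-library/cat-toy")
image = pipe("a <cat-toy> riding a skateboard", num_inference_steps=50).images[0]
```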

⭐⭐Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion
[CVPR 2023] [Website] [Project] [Diffusers Doc] [Diffusers Code] [Code]

⭐⭐ReVersion: Diffusion-Based Relation Inversion from Images
[Website] [Project] [Code]

SINE: SINgle Image Editing with Text-to-Image Diffusion Models
[CVPR 2023] [Website] [Project] [Code]

Break-A-Scene: Extracting Multiple Concepts from a Single Image
[SIGGRAPH Asia 2023] [Project] [Code]

Concept Decomposition for Visual Exploration and Inspiration
[SIGGRAPH Asia 2023] [Project] [Code]

Cones: Concept Neurons in Diffusion Models for Customized Generation
[ICML 2023 Oral] [Website] [Code]

BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing
[NeurIPS 2023] [Website] [Project] [Code]

Inserting Anybody in Diffusion Models via Celeb Basis
[NeurIPS 2023] [Website] [Project] [Code]

Controlling Text-to-Image Diffusion by Orthogonal Finetuning
[NeurIPS 2023] [Website] [Project] [Code]

Photoswap: Personalized Subject Swapping in Images
[NeurIPS 2023] [Website] [Project] [Code]

Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
[NeurIPS 2023] [Website] [Project] [Code]

ITI-GEN: Inclusive Text-to-Image Generation
[ICCV 2023 Oral] [Website] [Project] [Code]

Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models
[ICCV 2023] [Website] [Project] [Code]

ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation
[ICCV 2023 Oral] [Website] [Code]

A Neural Space-Time Representation for Text-to-Image Personalization
[SIGGRAPH Asia 2023] [Project] [Code]

Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models
[SIGGRAPH 2023] [Project] [Code]

Is This Loss Informative? Speeding Up Textual Inversion with Deterministic Objective Evaluation
[NeurIPS 2023] [Website] [Code]

Face2Diffusion for Fast and Editable Face Personalization
[CVPR 2024] [Project] [Code]

Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models
[CVPR 2024] [Project] [Code]

Style Aligned Image Generation via Shared Attention
[CVPR 2024] [Project] [Code]

DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization
[CVPR 2024] [Project] [Code]

ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models
[AAAI 2024] [Project] [Code]

MasterWeaver: Taming Editability and Identity for Personalized Text-to-Image Generation
[Website] [Project] [Code]

Customizing Text-to-Image Models with a Single Image Pair
[Website] [Project] [Code]

ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving
[Website] [Project] [Code]

ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning
[Website] [Project] [Code]

CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models
[Website] [Project] [Code]

Customizing Text-to-Image Diffusion with Camera Viewpoint Control
[Website] [Project] [Code]

Harmonizing Visual and Textual Embeddings for Zero-Shot Text-to-Image Customization
[Website] [Project] [Code]

ZeST: Zero-Shot Material Transfer from a Single Image
[Website] [Project] [Code]

Material Palette: Extraction of Materials from a Single Image
[Website] [Project] [Code]

StyleDrop: Text-to-Image Generation in Any Style
[Website] [Project] [Code]

FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
[Website] [Project] [Code]

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
[Website] [Project] [Code]

Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning
[Website] [Project] [Code]

Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion
[Website] [Project] [Code]

DreamArtist: Towards Controllable One-Shot Text-to-Image Generation via Positive-Negative Prompt-Tuning
[Website] [Project] [Code]

The Hidden Language of Diffusion Models
[Website] [Project] [Code]

SingleInsert: Inserting New Concepts from a Single Image into Text-to-Image Models for Flexible Editing
[Website] [Project] [Code]

CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models
[Website] [Project] [Code]

When StyleGAN Meets Stable Diffusion: a W+ Adapter for Personalized Image Generation
[Website] [Project] [Code]

InstantID: Zero-shot Identity-Preserving Generation in Seconds
[Website] [Project] [Code]

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
[Website] [Project] [Code]

CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization
[Website] [Project] [Code]

DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models
[Website] [Project] [Code]

CapHuman: Capture Your Moments in Parallel Universes
[Website] [Project] [Code]

λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
[Website] [Project] [Code]

Learning Continuous 3D Words for Text-to-Image Generation
[Website] [Project] [Code]

Viewpoint Textual Inversion: Unleashing Novel View Synthesis with Pretrained 2D Diffusion Models
[Website] [Project] [Code]

Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition
[Website] [Project] [Code]

DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Model
[Website] [Project] [Code]

OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models
[Website] [Project] [Code]

ProSpect: Expanded Conditioning for the Personalization of Attribute-aware Image Generation
[SIGGRAPH Asia 2023] [Code]

Multiresolution Textual Inversion
[NeurIPS 2022 workshop] [Code]

Compositional Inversion for Stable Diffusion Models
[AAAI 2024] [Code]

PuLID: Pure and Lightning ID Customization via Contrastive Alignment
[Website] [Code]

Cross Initialization for Personalized Text-to-Image Generation
[Website] [Code]

Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach
[Website] [Code]

SVDiff: Compact Parameter Space for Diffusion Fine-Tuning
[Website] [Code]

ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation
[Website] [Code]

AerialBooth: Mutual Information Guidance for Text Controlled Aerial View Synthesis from a Single Image
[Website] [Code]

A Closer Look at Parameter-Efficient Tuning in Diffusion Models
[Website] [Code]

Controllable Textual Inversion for Personalized Text-to-Image Generation
[Website] [Code]

Cross-domain Compositing with Pretrained Diffusion Models
[Website] [Code]

Concept-centric Personalization with Large-scale Diffusion Priors
[Website] [Code]

Customization Assistant for Text-to-image Generation
[Website] [Code]

High-fidelity Person-centric Subject-to-Image Synthesis
[Website] [Code]

LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models
[Website] [Code]

Language-Informed Visual Concept Learning
[ICLR 2024] [Project]

Key-Locked Rank One Editing for Text-to-Image Personalization
[SIGGRAPH 2023] [Project]

Diffusion in Style
[ICCV 2023] [Project]

RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization
[CVPR 2024] [Project]

MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation
[Website] [Project]

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
[Website] [Project]

PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization
[Website] [Project]

Subject-driven Text-to-Image Generation via Apprenticeship Learning
[Website] [Project]

Orthogonal Adaptation for Modular Customization of Diffusion Models
[Website] [Project]

Diffusion in Diffusion: Cyclic One-Way Diffusion for Text-Vision-Conditioned Generation
[Website] [Project]

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models
[Website] [Project]

ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs
[Website] [Project]

Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models
[Website] [Project]

$P+$: Extended Textual Conditioning in Text-to-Image Generation
[Website] [Project]

PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models
[Website] [Project]

InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning
[Website] [Project]

Total Selfie: Generating Full-Body Selfies
[Website] [Project]

DreamTuner: Single Image is Enough for Subject-Driven Generation
[Website] [Project]

PALP: Prompt Aligned Personalization of Text-to-Image Models
[Website] [Project]

TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
[Website] [Project]

Direct Consistency Optimization for Compositional Text-to-Image Personalization
[Website] [Project]

Visual Style Prompting with Swapping Self-Attention
[Website] [Project]

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm
[Website] [Project]

Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models
[CVPR 2024]

DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models
[AAAI 2024]

Towards Prompt-robust Face Privacy Protection via Adversarial Decoupling Augmentation Framework
[Website]

InstaStyle: Inversion Noise of a Stylized Image is Secretly a Style Adviser
[Website]

DisenBooth: Disentangled Parameter-Efficient Tuning for Subject-Driven Text-to-Image Generation
[Website]

Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models
[Website]

Gradient-Free Textual Inversion
[Website]

Identity Encoder for Personalized Diffusion
[Website]

Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image Generation
[Website]

ELODIN: Naming Concepts in Embedding Spaces
[Website]

Cones 2: Customizable Image Synthesis with Multiple Subjects
[Website]

Generate Anything Anywhere in Any Scene
[Website]

Paste, Inpaint and Harmonize via Denoising: Subject-Driven Image Editing with Pre-Trained Diffusion Model
[Website]

Face0: Instantaneously Conditioning a Text-to-Image Model on a Face
[Website]

MagiCapture: High-Resolution Multi-Concept Portrait Customization
[Website]

A Data Perspective on Enhanced Identity Preservation for Diffusion Personalization
[Website]

DIFFNAT: Improving Diffusion Image Quality Using Natural Image Statistics
[Website]

An Image is Worth Multiple Words: Multi-attribute Inversion for Constrained Text-to-Image Synthesis
[Website]

Lego: Learning to Disentangle and Invert Concepts Beyond Object Appearance in Text-to-Image Diffusion Models
[Website]

Memory-Efficient Personalization using Quantized Diffusion Model
[Website]

BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models
[Website]

Pick-and-Draw: Training-free Semantic Guidance for Text-to-Image Personalization
[Website]

Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding
[Website]

StableIdentity: Inserting Anybody into Anywhere at First Sight
[Website]

SeFi-IDE: Semantic-Fidelity Identity Embedding for Personalized Diffusion-Based Generation
[Website]

Visual Concept-driven Image Generation with Text-to-Image Diffusion Model
[Website]

ComFusion: Personalized Subject Generation in Multiple Specific Scenes From Single Image
[Website]

IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models
[Website]

MM-Diff: High-Fidelity Image Personalization via Multi-Modal Condition Integration
[Website]

DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation
[Website]

OneActor: Consistent Character Generation via Cluster-Conditioned Guidance
[Website]

T2I Diffusion Model Augmentation

⭐⭐⭐Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models
[SIGGRAPH 2023] [Project] [Official Code] [Diffusers Code] [Diffusers doc] [Replicate Demo]
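
Attend-and-Excite ships as a diffusers pipeline: you pass the indices of the subject tokens whose attention should be maximized during sampling (the indices below match "cat" and "frog" in the example prompt; `pipe.get_indices(prompt)` lists them):

```python
import torch
from diffusers import StableDiffusionAttendAndExcitePipeline

pipe = StableDiffusionAttendAndExcitePipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
prompt = "a cat and a frog"
# token_indices: subject tokens to "excite" so neither subject is dropped
image = pipe(prompt, token_indices=[2, 5], guidance_scale=7.5,
             num_inference_steps=50, max_iter_to_alter=25).images[0]
```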

SEGA: Instructing Diffusion using Semantic Dimensions
[NeurIPS 2023] [Website] [Code] [Diffusers Code] [Diffusers Doc]
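
SEGA is integrated as diffusers' semantic-guidance pipeline; each editing concept gets its own direction, scale, warmup, and threshold (the values below are illustrative):

```python
import torch
from diffusers import SemanticStableDiffusionPipeline

pipe = SemanticStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
out = pipe(
    "a photo of a castle",
    editing_prompt=["snowy winter", "crowded with tourists"],
    reverse_editing_direction=[False, True],   # add snow, remove tourists
    edit_guidance_scale=[6.0, 6.0],
    edit_warmup_steps=[10, 10],
    edit_threshold=[0.95, 0.95],
)
image = out.images[0]
```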

Improving Sample Quality of Diffusion Models Using Self-Attention Guidance
[ICCV 2023] [Website] [Project] [Code Official] [Diffusers Doc] [Diffusers Code]
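
Self-attention guidance is a one-parameter switch in its diffusers pipeline (`sag_scale=0` recovers ordinary sampling):

```python
import torch
from diffusers import StableDiffusionSAGPipeline

pipe = StableDiffusionSAGPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# sag_scale > 0 adds self-attention guidance on top of classifier-free guidance
image = pipe("a photo of an astronaut", guidance_scale=7.5, sag_scale=0.75).images[0]
```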

Expressive Text-to-Image Generation with Rich Text
[ICCV 2023] [Website] [Project] [Code] [Demo]

Editing Implicit Assumptions in Text-to-Image Diffusion Models
[ICCV 2023] [Website] [Project] [Code] [Demo]

ElasticDiffusion: Training-free Arbitrary Size Image Generation
[CVPR 2024] [Project] [Code] [Demo]

MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models
[ICCV 2023] [Website] [Project] [Code]

Discriminative Class Tokens for Text-to-Image Diffusion Models
[ICCV 2023] [Website] [Project] [Code]

Compositional Visual Generation with Composable Diffusion Models
[ECCV 2022] [Website] [Project] [Code]

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
[ICCV 2023] [Project] [Code] [Blog]

ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
[NeurIPS 2023] [Website] [Code]

Diffusion Self-Guidance for Controllable Image Generation
[NeurIPS 2023] [Website] [Project] [Code]

DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models
[NeurIPS 2023] [Website] [Code]

Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment
[NeurIPS 2023] [Website] [Code]

DemoFusion: Democratising High-Resolution Image Generation With No $$$
[CVPR 2024] [Project] [Code]

Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation
[CVPR 2024] [Project] [Code]

Divide & Bind Your Attention for Improved Generative Semantic Nursing
[BMVC 2023 Oral] [Project] [Code]

Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
[Website] [Project] [Code]

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
[Website] [Project] [Code]

Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions
[Website] [Project] [Code]

Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance
[Website] [Project] [Code]

Real-World Image Variation by Aligning Diffusion Inversion Chain
[Website] [Project] [Code]

FreeU: Free Lunch in Diffusion U-Net
[Website] [Project] [Code]
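
FreeU is a training-free re-weighting of the UNet's backbone and skip features, enabled on diffusers pipelines with one call (the b/s values below are the ones commonly suggested for SD 1.x; treat them as assumptions to tune per model):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# Re-weight backbone (b1, b2) and skip (s1, s2) features at inference time.
pipe.enable_freeu(s1=0.9, s2=0.2, b1=1.2, b2=1.4)
image = pipe("an astronaut riding a horse").images[0]
```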

ConceptLab: Creative Generation using Diffusion Prior Constraints
[Website] [Project] [Code]

Aligning Text-to-Image Diffusion Models with Reward Backpropagation
[Website] [Project] [Code]

Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models
[Website] [Project] [Code]

ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models
[Website] [Project] [Code]

One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls
[Website] [Project] [Code]

TokenCompose: Grounding Diffusion with Token-level Supervision
[Website] [Project] [Code]

DiffusionGPT: LLM-Driven Text-to-Image Generation System
[Website] [Project] [Code]

Decompose and Realign: Tackling Condition Misalignment in Text-to-Image Diffusion Models
[Website] [Project] [Code]

Taiyi-Diffusion-XL: Advancing Bilingual Text-to-Image Generation with Large Vision-Language Model Support
[Website] [Project] [Code]

ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations
[Website] [Project] [Code]

MuLan: Multimodal-LLM Agent for Progressive Multi-Object Diffusion
[Website] [Project] [Code]

ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models
[Website] [Project] [Code]

Stylus: Automatic Adapter Selection for Diffusion Models
[Website] [Project] [Code]

MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models
[Website] [Project] [Code]

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
[Website] [Project] [Code]

TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation
[Website] [Project] [Code]

SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models
[ACM MM 2023 Oral] [Code]

Get What You Want, Not What You Don't: Image Content Suppression for Text-to-Image Diffusion Models
[ICLR 2024] [Code]

Dynamic Prompt Optimizing for Text-to-Image Generation
[CVPR 2024] [Code]

Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models
[CVPR 2024] [Code]

Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance
[CVPR 2024] [Code]

InitNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization
[CVPR 2024] [Code]

AID: Attention Interpolation of Text-to-Image Diffusion
[Website] [Code]

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis
[Website] [Code]

ORES: Open-vocabulary Responsible Visual Synthesis
[Website] [Code]

Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness
[Website] [Code]

Detector Guidance for Multi-Object Text-to-Image Generation
[Website] [Code]

Designing a Better Asymmetric VQGAN for StableDiffusion
[Website] [Code]

FABRIC: Personalizing Diffusion Models with Iterative Feedback
[Website] [Code]

Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models
[Website] [Code]

Progressive Text-to-Image Diffusion with Soft Latent Direction
[Website] [Code]

Hypernymy Understanding Evaluation of Text-to-Image Models via WordNet Hierarchy
[Website] [Code]

If at First You Don’t Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection
[Website] [Code]

LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts
[Website] [Code]

Making Multimodal Generation Easier: When Diffusion Models Meet LLMs
[Website] [Code]

Enhancing Diffusion Models with Text-Encoder Reinforcement Learning
[Website] [Code]

AltDiffusion: A Multilingual Text-to-Image Diffusion Model
[Website] [Code]

It is all about where you start: Text-to-image generation with seed selection
[Website] [Code]

End-to-End Diffusion Latent Optimization Improves Classifier Guidance
[Website] [Code]

Correcting Diffusion Generation through Resampling
[Website] [Code]

Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
[Website] [Code]

A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image Synthesis
[Website] [Code]

PromptCharm: Text-to-Image Generation through Multi-modal Prompting and Refinement
[Website] [Code]

Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models
[Website] [Code]

LightIt: Illumination Modeling and Control for Diffusion Models
[CVPR 2024] [Project]

UniFL: Improve Stable Diffusion via Unified Feedback Learning
[Website] [Project]

Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
[Website] [Project]

Semantic Guidance Tuning for Text-To-Image Diffusion Models
[Website] [Project]

Amazing Combinatorial Creation: Acceptable Swap-Sampling for Text-to-Image Generation
[Website] [Project]

Image Anything: Towards Reasoning-coherent and Training-free Multi-modal Image Generation
[Website] [Project]

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
[Website] [Project]

FineDiffusion: Scaling up Diffusion Models for Fine-grained Image Generation with 10,000 Classes
[Website] [Project]

Lazy Diffusion Transformer for Interactive Image Editing
[Website] [Project]

Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
[Website] [Project]

Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models
[Website] [Project]

Norm-guided latent space exploration for text-to-image generation
[NeurIPS 2023] [Website]

Improving Diffusion-Based Image Synthesis with Context Prediction
[NeurIPS 2023] [Website]

On Mechanistic Knowledge Localization in Text-to-Image Generative Models
[ICML 2024]

Diffscaler: Enhancing the Generative Prowess of Diffusion Transformers
[Website]

Object-Attribute Binding in Text-to-Image Generation: Evaluation and Control
[Website]

Aligning Diffusion Models by Optimizing Human Utility
[Website]

Instruct-Imagen: Image Generation with Multi-modal Instruction
[Website]

CONFORM: Contrast is All You Need For High-Fidelity Text-to-Image Diffusion Models
[Website]

MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask
[Website]

Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images
[Website]

Text2Layer: Layered Image Generation using Latent Diffusion Model
[Website]

Stimulating the Diffusion Model for Image Denoising via Adaptive Embedding and Ensembling
[Website]

A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
[Website]

UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion
[Website]

Improving Compositional Text-to-image Generation with Large Vision-Language Models
[Website]

Multi-Concept T2I-Zero: Tweaking Only The Text Embeddings and Nothing Else
[Website]

Unseen Image Synthesis with Diffusion Models
[Website]

AnyLens: A Generative Diffusion Model with Any Rendering Lens
[Website]

Seek for Incantations: Towards Accurate Text-to-Image Diffusion Synthesis through Prompt Engineering
[Website]

Text2Street: Controllable Text-to-image Generation for Street Views
[Website]

Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation
[Website]

Contrastive Prompts Improve Disentanglement in Text-to-Image Diffusion Model
[Website]

Debiasing Text-to-Image Diffusion Models
[Website]

Stochastic Conditional Diffusion Models for Semantic Image Synthesis
[Website]

Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion
[Website]

Transparent Image Layer Diffusion using Latent Transparency
[Website]

Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation
[Website]

HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances
[Website]

StereoDiffusion: Training-Free Stereo Image Generation Using Latent Diffusion Models
[Website]

Make Me Happier: Evoking Emotions Through Image Diffusion Models
[Website]

Zippo: Zipping Color and Transparency Distributions into a Single Diffusion Model
[Website]

LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model
[Website]

AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation
[Website]

U-Sketch: An Efficient Approach for Sketch to Image Diffusion Models
[Website]

ECNet: Effective Controllable Text-to-Image Diffusion Models
[Website]

TextCraftor: Your Text Encoder Can be Image Quality Controller
[Website]

Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models
[Website]

Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding
[Website]

Towards Better Text-to-Image Generation Alignment via Attention Modulation
[Website]

Spatial Control

MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation
[ICML 2023] [Website] [Project] [Code] [Diffusers Code] [Diffusers Doc] [Replicate Demo]
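
MultiDiffusion's fused-window sampling backs diffusers' panorama pipeline: overlapping diffusion windows are denoised and averaged into one consistent wide image (model id and output size are illustrative):

```python
import torch
from diffusers import StableDiffusionPanoramaPipeline, DDIMScheduler

model_id = "stabilityai/stable-diffusion-2-base"
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionPanoramaPipeline.from_pretrained(
    model_id, scheduler=scheduler, torch_dtype=torch.float16
).to("cuda")
# Overlapping denoising windows are fused across the full width.
image = pipe("a photo of the dolomites", height=512, width=2048).images[0]
```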

SceneComposer: Any-Level Semantic Image Synthesis
[CVPR 2023 Highlight] [Website] [Project] [Code]

GLIGEN: Open-Set Grounded Text-to-Image Generation
[CVPR 2023] [Website] [Code] [Demo]
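
GLIGEN's grounded generation is available in diffusers: each phrase is tied to a normalized xyxy box (the checkpoint and layout below are illustrative):

```python
import torch
from diffusers import StableDiffusionGLIGENPipeline

pipe = StableDiffusionGLIGENPipeline.from_pretrained(
    "masterful/gligen-1-4-generation-text-box", torch_dtype=torch.float16
).to("cuda")
image = pipe(
    prompt="a waterfall and a modern high-speed train in a beautiful forest",
    gligen_phrases=["a waterfall", "a modern high-speed train"],
    gligen_boxes=[[0.14, 0.20, 0.43, 0.71], [0.50, 0.44, 0.85, 0.73]],  # xyxy in [0,1]
    gligen_scheduled_sampling_beta=1.0,
    num_inference_steps=50,
).images[0]
```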

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
[ICLR 2023] [Website] [Project] [Code]

Visual Programming for Text-to-Image Generation and Evaluation
[NeurIPS 2023] [Website] [Project] [Code]

GeoDiffusion: Text-Prompted Geometric Control for Object Detection Data Generation
[ICLR 2024] [Website] [Project] [Code]

ReCo: Region-Controlled Text-to-Image Generation
[CVPR 2023] [Website] [Code]

Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis
[ICCV 2023] [Website] [Code]

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
[ICCV 2023] [Website] [Code]

Dense Text-to-Image Generation with Attention Modulation
[ICCV 2023] [Website] [Code]

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
[Website] [Project] [Code] [Demo] [Blog]

Training-Free Layout Control with Cross-Attention Guidance
[Website] [Project] [Code]

Directed Diffusion: Direct Control of Object Placement through Attention Guidance
[Website] [Project] [Code]

Grounded Text-to-Image Synthesis with Attention Refocusing
[Website] [Project] [Code]

eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
[Website] [Project] [Code]

LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation
[Website] [Project] [Code]

Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models
[Website] [Project] [Code]

R&B: Region and Boundary Aware Zero-shot Grounded Text-to-image Generation
[Website] [Project] [Code]

FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition
[Website] [Project] [Code]

InstanceDiffusion: Instance-level Control for Image Generation
[Website] [Project] [Code]

Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis
[CVPR 2024] [Code]

NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging
[CVPR 2024] [Code]

StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control
[CVPR 2024] [Code]

Masked-Attention Diffusion Guidance for Spatially Controlling Text-to-Image Generation
[Website] [Code]

MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis
[Website] [Code]

DivCon: Divide and Conquer for Progressive Text-to-Image Generation
[Website] [Code]

RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models
[Website] [Code]

InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models
[Website] [Project]

Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
[Website] [Project]

Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation
[Website] [Project]

ReGround: Improving Textual and Spatial Grounding at No Cost
[Website] [Project]

DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception
[CVPR 2024]

Guided Image Synthesis via Initial Image Editing in Diffusion Model
[ACM MM 2023]

GLoD: Composing Global Contexts and Local Details in Image Generation
[Website]

Enhancing Prompt Following with Visual Control Through Training-Free Mask-Guided Diffusion
[Website]

A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis
[Website]

Controllable Text-to-Image Generation with GPT-4
[Website]

Localized Text-to-Image Generation for Free via Cross Attention Control
[Website]

Training-Free Location-Aware Text-to-Image Synthesis
[Website]

Composite Diffusion | whole >= Σ parts
[Website]

Continuous Layout Editing of Single Images with Diffusion Models
[Website]

Zero-shot spatial layout conditioning for text-to-image diffusion models
[Website]

Obtaining Favorable Layouts for Multiple Object Generation
[Website]

Enhancing Object Coherence in Layout-to-Image Synthesis
[Website]

LoCo: Locally Constrained Training-Free Layout-to-Image Synthesis
[Website]

Self-correcting LLM-controlled Diffusion Models
[Website]

Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis
[Website]

Joint Generative Modeling of Scene Graphs and Images via Diffusion Models
[Website]

Spatial-Aware Latent Initialization for Controllable Image Generation
[Website]

Layout-to-Image Generation with Localized Descriptions using ControlNet with Cross-Attention Control
[Website]

ObjBlur: A Curriculum Learning Approach With Progressive Object-Level Blurring for Improved Layout-to-Image Generation
[Website]

I2I Translation

⭐⭐⭐SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
[ICLR 2022] [Website] [Project] [Code]
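
SDEdit is, in practice, what img2img pipelines implement: noise the input partway up the schedule, then denoise toward the prompt. In diffusers, `strength` sets how far to jump (file name and values below are illustrative):

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
init = load_image("sketch.png").resize((512, 512))
# strength controls the added noise: higher = follow the text more,
# lower = stay closer to the input image.
image = pipe("a fantasy landscape, oil painting", image=init,
             strength=0.6, guidance_scale=7.5).images[0]
```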

CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation
[NeurIPS 2023] [Website] [Project] [Code]

DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations
[CVPR 2024] [Project] [Code]

DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation
[CVPR 2022] [Website] [Code]

Diffusion-based Image Translation using Disentangled Style and Content Representation
[ICLR 2023] [Website] [Code]

FlexIT: Towards Flexible Semantic Image Translation
[CVPR 2022] [Website] [Code]

Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer
[ICCV 2023] [Website] [Code]

Cross-Image Attention for Zero-Shot Appearance Transfer
[Website] [Project] [Code]

Diffusion Guided Domain Adaptation of Image Generators
[Website] [Project] [Code]

Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models
[Website] [Project] [Code]

BBDM: Image-to-image Translation with Brownian Bridge Diffusion Models
[CVPR 2023] [Code]

Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance
[Website] [Code]

GEM: Boost Simple Network for Glass Surface Segmentation via Segment Anything Model and Data Synthesis
[Website] [Code]

CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion
[Website] [Code]

PrimeComposer: Faster Progressively Combined Diffusion for Image Composition with Attention Steering
[Website] [Code]

One-Step Image Translation with Text-to-Image Models
[Website] [Code]

FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models
[Website] [Project]

StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models
[ICCV 2023] [Website]

One-Shot Structure-Aware Stylized Image Synthesis
[CVPR 2024]

ControlStyle: Text-Driven Stylized Image Generation Using Diffusion Priors
[ACM MM 2023]

High-Fidelity Diffusion-based Image Editing
[AAAI 2024]

Spectrum Translation for Refinement of Image Generation (STIG) Based on Contrastive Learning and Spectral Filter Profile
[AAAI 2024]

E2GAN: Efficient Training of Efficient GANs for Image-to-Image Translation
[Website]

UniHDA: Towards Universal Hybrid Domain Adaptation of Image Generators
[Website]

FilterPrompt: Guiding Image Transfer in Diffusion Models
[Website]

Segmentation Detection Tracking

ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
[CVPR 2023 Highlight] [Project] [Code] [Demo]

LD-ZNet: A Latent Diffusion Approach for Text-Based Image Segmentation
[ICCV 2023] [Website] [Project] [Code]

Stochastic Segmentation with Conditional Categorical Diffusion Models
[ICCV 2023] [Website] [Code]

DDP: Diffusion Model for Dense Visual Prediction
[ICCV 2023] [Website] [Code]

DiffusionDet: Diffusion Model for Object Detection
[ICCV 2023] [Website] [Code]

OVTrack: Open-Vocabulary Multiple Object Tracking
[CVPR 2023] [Website] [Project]

SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process
[NeurIPS 2023] [Website] [Code]

DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction
[CVPR 2024] [Project] [Code]

Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion
[Website] [Project] [Code]

InstaGen: Enhancing Object Detection by Training on Synthetic Dataset
[Website] [Project] [Code]

Delving into the Trajectory Long-tail Distribution for Multi-Object Tracking
[Website] [Code]

Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation with Its Class Label
[Website] [Code]

Personalize Segment Anything Model with One Shot
[Website] [Code]

DiffusionTrack: Diffusion Model For Multi-Object Tracking
[Website] [Code]

MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation
[Website] [Code]

A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
[Website] [Code]

Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation
[Website] [Code]

UniGS: Unified Representation for Image Generation and Segmentation
[Website] [Code]

Placing Objects in Context via Inpainting for Out-of-distribution Segmentation
[Website] [Code]

MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation
[Website] [Code]

Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation
[Website] [Code]

Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models
[Website] [Code]

EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models
[ICLR 2024] [Website] [Project]

Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation
[CVPR 2024] [Project]

FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models
[Website] [Project]

DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models
[Website] [Project]

Diffusion-based Image Translation with Label Guidance for Domain Adaptive Semantic Segmentation
[ICCV 2023] [Website]

SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection
[CVPR 2024]

Generalization by Adaptation: Diffusion-Based Domain Extension for Domain-Generalized Semantic Segmentation
[WACV 2024]

SLiMe: Segment Like Me
[Website]

ASAM: Boosting Segment Anything Model with Adversarial Tuning
[Website]

MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model for Few-Shot Instance Segmentation
[Website]

DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery
[Website]

Ref-Diff: Zero-shot Referring Image Segmentation with Generative Models
[Website]

Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter
[Website]

Attention as Annotation: Generating Images and Pseudo-masks for Weakly Supervised Semantic Segmentation with Diffusion
[Website]

From Text to Mask: Localizing Entities Using the Attention of Text-to-Image Diffusion Models
[Website]

Factorized Diffusion Architectures for Unsupervised Image Generation and Segmentation
[Website]

Patch-based Selection and Refinement for Early Object Detection
[Website]

TrackDiffusion: Multi-object Tracking Data Generation via Diffusion Models
[Website]

Towards Granularity-adjusted Pixel-level Semantic Annotation
[Website]

Gen2Det: Generate to Detect
[Website]

Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors
[Website]

ConsistencyDet: Robust Object Detector with Denoising Paradigm of Consistency Model
[Website]

Additional Conditions

⭐⭐⭐Adding Conditional Control to Text-to-Image Diffusion Models
[ICCV 2023 best paper] [Website] [Official Code] [Diffusers Doc] [Diffusers Code]
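
ControlNet plugs a condition encoder into a frozen SD UNet; in diffusers you pair a `ControlNetModel` with the matching pipeline (the Canny checkpoint and edge image below are illustrative):

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")
edge_map = load_image("canny_edges.png")       # precomputed Canny edge image
image = pipe("a futuristic city at night", image=edge_map,
             num_inference_steps=30).images[0]
```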

⭐⭐T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
[Website] [Official Code] [Diffusers Code]

SketchKnitter: Vectorized Sketch Generation with Diffusion Models
[ICLR 2023 Spotlight] [Website] [Code]

Freestyle Layout-to-Image Synthesis
[CVPR 2023 highlight] [Website] [Project] [Code]

Collaborative Diffusion for Multi-Modal Face Generation and Editing
[CVPR 2023] [Website] [Project] [Code]

HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation
[ICCV 2023] [Website] [Project] [Code]

FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model
[ICCV 2023] [Website] [Code]

Sketch-Guided Text-to-Image Diffusion Models
[SIGGRAPH 2023] [Project] [Code]

Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive
[ICLR 2024] [Project] [Code]

Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
[Website] [Project] [Code]

IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
[Website] [Project] [Code]
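
IP-Adapter also has first-class support in Diffusers; a minimal sketch using the Diffusers IP-Adapter loader, assuming the authors' published weights at `h94/IP-Adapter` and a local reference image:

```python
# IP-Adapter sketch: attach an image-prompt adapter to a stock SD pipeline.
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# Official IP-Adapter weights published by the authors on the Hub.
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models",
                     weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(0.6)  # trade off image prompt vs. text prompt

ref = load_image("reference.png")  # hypothetical image prompt
out = pipe("best quality, a photo in the reference style",
           ip_adapter_image=ref).images[0]
```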

HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
[Website] [Project] [Code]

Late-Constraint Diffusion Guidance for Controllable Image Synthesis
[Website] [Project] [Code]

Composer: Creative and controllable image synthesis with composable conditions
[Website] [Project] [Code]

DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models
[Website] [Project] [Code]

Cocktail: Mixing Multi-Modality Controls for Text-Conditional Image Generation
[Website] [Project] [Code]

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
[Website] [Project] [Code]

Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
[Website] [Project] [Code]

LooseControl: Lifting ControlNet for Generalized Depth Conditioning
[Website] [Project] [Code]

X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model
[Website] [Project] [Code]

ControlNet-XS: Designing an Efficient and Effective Architecture for Controlling Text-to-Image Diffusion Models
[Website] [Project] [Code]

ViscoNet: Bridging and Harmonizing Visual and Textual Conditioning for ControlNet
[Website] [Project] [Code]

SCP-Diff: Photo-Realistic Semantic Image Synthesis with Spatial-Categorical Joint Prior
[Website] [Project] [Code]

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
[ICLR 2024] [Code]

It's All About Your Sketch: Democratising Sketch Control in Diffusion Models
[CVPR 2024] [Code]

Universal Guidance for Diffusion Models
[Website] [Code]

Meta ControlNet: Enhancing Task Adaptation via Meta Learning
[Website] [Code]

Local Conditional Controlling for Text-to-Image Diffusion Models
[Website] [Code]

Modulating Pretrained Diffusion Models for Multimodal Image Synthesis
[SIGGRAPH 2023] [Project]

SpaText: Spatio-Textual Representation for Controllable Image Generation
[CVPR 2023] [Project]

ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback
[Website] [Project]

BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion
[Website] [Project]

CCM: Adding Conditional Controls to Text-to-Image Consistency Models
[Website] [Project]

FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection
[Website] [Project]

Control4D: Dynamic Portrait Editing by Learning 4D GAN from 2D Diffusion-based Editor
[Website] [Project]

SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing
[Website] [Project]

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation
[Website]

Conditioning Diffusion Models via Attributes and Semantic Masks for Face Generation
[Website]

Integrating Geometric Control into Text-to-Image Diffusion Models for High-Quality Detection Data Generation via Text Prompt
[Website]

Adding 3D Geometry Control to Diffusion Models
[Website]

LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation
[Website]

JointNet: Extending Text-to-Image Diffusion for Dense Distribution Modeling
[Website]

Do You Guys Want to Dance: Zero-Shot Compositional Human Dance Generation with Multiple Persons
[Website]

Mask-ControlNet: Higher-Quality Image Generation with An Additional Mask Prompt
[Website]

FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation
[Website]

Few-Shot

Discriminative Diffusion Models as Few-shot Vision and Language Learners
[Website] [Code]

Few-Shot Diffusion Models
[Website] [Code]

Few-shot Semantic Image Synthesis with Class Affinity Transfer
[CVPR 2023] [Website]

DiffAlign: Few-shot learning using diffusion based synthesis and alignment
[Website]

Few-shot Image Generation with Diffusion Models
[Website]

Lafite2: Few-shot Text-to-Image Generation
[Website]

SD-inpaint

Paint by Example: Exemplar-based Image Editing with Diffusion Models
[CVPR 2023] [Website] [Code] [Diffusers Doc] [Diffusers Code]
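
Given the Diffusers links above, a minimal sketch of the Paint-by-Example pipeline; the file names are placeholders:

```python
# Paint-by-Example sketch: repaint a masked region of a scene so it matches
# a reference exemplar image.
import torch
from diffusers import PaintByExamplePipeline
from diffusers.utils import load_image

pipe = PaintByExamplePipeline.from_pretrained(
    "Fantasy-Studio/Paint-by-Example", torch_dtype=torch.float16
).to("cuda")

init_image = load_image("scene.png")   # image to edit
mask_image = load_image("mask.png")    # white = region to repaint
example = load_image("exemplar.png")   # reference object to paint in
result = pipe(image=init_image, mask_image=mask_image,
              example_image=example).images[0]
result.save("paint_by_example_out.png")
```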

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
[ICML 2022 Spotlight] [Website] [Code]

Blended Diffusion for Text-driven Editing of Natural Images
[CVPR 2022] [Website] [Project] [Code]

Blended Latent Diffusion
[SIGGRAPH 2023] [Project] [Code]

TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition
[ICCV 2023] [Website] [Project] [Code]

Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
[CVPR 2023] [Website] [Code]

Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models
[ICML 2023] [Website] [Code]

Inst-Inpaint: Instructing to Remove Objects with Diffusion Models
[Website] [Project] [Code] [Demo]

Anywhere: A Multi-Agent Framework for Reliable and Diverse Foreground-Conditioned Image Inpainting
[Website] [Project] [Code]

AnyDoor: Zero-shot Object-level Image Customization
[Website] [Project] [Code]

A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting
[Website] [Project] [Code]

Towards Language-Driven Video Inpainting via Multimodal Large Language Models
[Website] [Project] [Code]

360-Degree Panorama Generation from Few Unregistered NFoV Images
[ACM MM 2023] [Code]

Delving Globally into Texture and Structure for Image Inpainting
[ACM MM 2022] [Code]

Training-and-prompt-free General Painterly Harmonization Using Image-wise Attention Sharing
[Website] [Code]

Structure Matters: Tackling the Semantic Discrepancy in Diffusion Models for Image Inpainting
[Website] [Code]

Reference-based Image Composition with Sketch via Structure-aware Diffusion Model
[Website] [Code]

Image Inpainting via Iteratively Decoupled Probabilistic Modeling
[Website] [Code]

ControlCom: Controllable Image Composition using Diffusion Model
[Website] [Code]

Uni-paint: A Unified Framework for Multimodal Image Inpainting with Pretrained Diffusion Model
[Website] [Code]

MagicRemover: Tuning-free Text-guided Image Inpainting with Diffusion Models
[Website] [Code]

HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models
[Website] [Code]

BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion
[Website] [Code]

Sketch-guided Image Inpainting with Partial Discrete Diffusion Process
[Website] [Code]

IMPRINT: Generative Object Compositing by Learning Identity-Preserving Representation
[CVPR 2024] [Project]

Taming Latent Diffusion Model for Neural Radiance Field Inpainting
[Website] [Project]

SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control
[Website] [Project]

Towards Stable and Faithful Inpainting
[Website] [Project]

Magic Fixup: Streamlining Photo Editing by Watching Dynamic Videos
[Website] [Project]

ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion
[Website] [Project]

Semantically Consistent Video Inpainting with Conditional Diffusion Models
[Website]

Personalized Face Inpainting with Diffusion Models by Parallel Visual Attention
[Website]

Outline-Guided Object Inpainting with Diffusion Models
[Website]

SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model
[Website]

GradPaint: Gradient-Guided Inpainting with Diffusion Models
[Website]

Infusion: Internal Diffusion for Video Inpainting
[Website]

Rethinking Referring Object Removal
[Website]

Tuning-Free Image Customization with Image and Text Guidance
[Website]

Layout Generation

LayoutDM: Discrete Diffusion Model for Controllable Layout Generation
[CVPR 2023] [Website] [Project] [Code]

Desigen: A Pipeline for Controllable Design Template Generation
[CVPR 2024] [Project] [Code]

DLT: Conditioned layout generation with Joint Discrete-Continuous Diffusion Layout Transformer
[ICCV 2023] [Website] [Code]

LayoutDiffusion: Improving Graphic Layout Generation by Discrete Diffusion Probabilistic Models
[ICCV 2023] [Website] [Code]

LayoutDM: Transformer-based Diffusion Model for Layout Generation
[CVPR 2023] [Website]

Unifying Layout Generation with a Decoupled Diffusion Model
[CVPR 2023] [Website]

PLay: Parametrically Conditioned Layout Generation using Latent Diffusion
[ICML 2023] [Website]

Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints
[ICLR 2024]

Diffusion-based Document Layout Generation
[Website]

Dolfin: Diffusion Layout Transformers without Autoencoder
[Website]

LayoutFlow: Flow Matching for Layout Generation
[Website]

Text Generation

TextDiffuser: Diffusion Models as Text Painters
[NeurIPS 2023] [Website] [Project] [Code]

TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering
[Website] [Code]

GlyphControl: Glyph Conditional Control for Visual Text Generation
[NeurIPS 2023] [Website] [Code]

DiffUTE: Universal Text Editing Diffusion Model
[NeurIPS 2023] [Website] [Code]

Word-As-Image for Semantic Typography
[SIGGRAPH 2023] [Project] [Code]

UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models
[Website] [Project] [Code]

Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model
[AAAI 2024] [Code]

FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning
[AAAI 2024] [Code]

Text Image Inpainting via Global Structure-Guided Diffusion Models
[AAAI 2024] [Code]

Ambigram generation by a diffusion model
[ICDAR 2023] [Code]

Scene Text Image Super-resolution based on Text-conditional Diffusion Models
[WACV 2024] [Code]

AnyText: Multilingual Visual Text Generation And Editing
[Website] [Code]

Few-shot Calligraphy Style Learning
[Website] [Code]

AmbiGen: Generating Ambigrams from Pre-trained Diffusion Model
[Website] [Project]

UniVG: Towards UNIfied-modal Video Generation
[Website] [Project]

DECDM: Document Enhancement using Cycle-Consistent Diffusion Models
[WACV 2024]

VecFusion: Vector Font Generation with Diffusion
[Website]

Typographic Text Generation with Off-the-Shelf Diffusion Model
[Website]

Font Style Interpolation with Diffusion Models
[Website]

Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation
[Website]

DiffCJK: Conditional Diffusion Model for High-Quality and Wide-coverage CJK Character Generation
[Website]

Super Resolution
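
Many entries below adapt a pretrained diffusion prior to super-resolution. For orientation only, a minimal sketch of the stock Stable Diffusion x4 upscaler in Diffusers (not the method of any listed paper; file names are placeholders):

```python
# Stock SD x4 upscaler sketch: text-conditioned 4x super-resolution.
import torch
from diffusers import StableDiffusionUpscalePipeline
from diffusers.utils import load_image

pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")

low_res = load_image("low_res.png").resize((128, 128))  # hypothetical input
upscaled = pipe(prompt="a sharp photo of a cat", image=low_res).images[0]
upscaled.save("upscaled_x4.png")
```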

ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting
[NeurIPS 2023 Spotlight] [Website] [Project] [Code]

Image Super-Resolution via Iterative Refinement
[TPAMI] [Website] [Project] [Code]

DiffIR: Efficient Diffusion Model for Image Restoration
[ICCV 2023] [Website] [Code]

AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation
[Website] [Project] [Code]

Exploiting Diffusion Prior for Real-World Image Super-Resolution
[Website] [Project] [Code]

SinSR: Diffusion-Based Image Super-Resolution in a Single Step
[CVPR 2024] [Code]

Iterative Token Evaluation and Refinement for Real-World Super-Resolution
[AAAI 2024] [Code]

DeeDSR: Towards Real-World Image Super-Resolution via Degradation-Aware Stable Diffusion
[Website] [Code]

Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach
[Website] [Code]

Pixel-Aware Stable Diffusion for Realistic Image Super-resolution and Personalized Stylization
[Website] [Code]

DSR-Diff: Depth Map Super-Resolution with Diffusion Model
[Website] [Code]

SAM-DiffSR: Structure-Modulated Diffusion Model for Image Super-Resolution
[Website] [Code]

XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution
[Website] [Code]

Self-Adaptive Reality-Guided Diffusion for Artifact-Free Super-Resolution
[Website] [Code]

BlindDiff: Empowering Degradation Modelling in Diffusion Models for Blind Image Super-Resolution
[Website] [Code]

HSR-Diff: Hyperspectral Image Super-Resolution via Conditional Diffusion Models
[ICCV 2023] [Website]

Text-guided Explorable Image Super-resolution
[CVPR 2024]

Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder
[CVPR 2024]

Enhancing Hyperspectral Images via Diffusion Model and Group-Autoencoder Super-resolution Network
[AAAI 2024]

Detail-Enhancing Framework for Reference-Based Image Super-Resolution
[Website]

You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation
[Website]

Solving Diffusion ODEs with Optimal Boundary Conditions for Better Image Super-Resolution
[Website]

Dissecting Arbitrary-scale Super-resolution Capability from Pre-trained Diffusion Generative Models
[Website]

YODA: You Only Diffuse Areas. An Area-Masked Diffusion Approach For Image Super-Resolution
[Website]

Domain Transfer in Latent Space (DTLS) Wins on Image Super-Resolution -- a Non-Denoising Model
[Website]

Image Super-Resolution with Text Prompt Diffusion
[Website]

DifAugGAN: A Practical Diffusion-style Data Augmentation for GAN-based Single Image Super-resolution
[Website]

DREAM: Diffusion Rectification and Estimation-Adaptive Models
[Website]

Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
[Website]

Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution
[Website]

CasSR: Activating Image Power for Real-World Image Super-Resolution
[Website]

Learning Spatial Adaptation and Temporal Coherence in Diffusion Models for Video Super-Resolution
[Website]

Video Generation

Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
[ICCV 2023 Oral] [Website] [Project] [Code]
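
Text2Video-Zero is available as a Diffusers pipeline; a minimal sketch, assuming `runwayml/stable-diffusion-v1-5` as the base model and `imageio` for writing the clip:

```python
# Text2Video-Zero sketch: zero-shot video from a text-to-image model.
import imageio
import torch
from diffusers import TextToVideoZeroPipeline

pipe = TextToVideoZeroPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# The pipeline returns frames as float arrays in [0, 1].
frames = pipe(prompt="a panda surfing a wave").images
frames = [(f * 255).astype("uint8") for f in frames]
imageio.mimsave("panda.mp4", frames, fps=4)
```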

SinFusion: Training Diffusion Models on a Single Image or Video
[ICML 2023] [Website] [Project] [Code]

Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
[CVPR 2023] [Website] [Project] [Code]

MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation
[NeurIPS 2022] [Website] [Project] [Code]

GLOBER: Coherent Non-autoregressive Video Generation via GLOBal Guided Video DecodER
[NeurIPS 2023] [Website] [Code]

Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator
[NeurIPS 2023] [Website] [Code]

Conditional Image-to-Video Generation with Latent Flow Diffusion Models
[CVPR 2023] [Website] [Code]

FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation
[CVPR 2024] [Project] [Code]

Video Diffusion Models
[ICLR 2022 Workshop] [Website] [Project] [Code]

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models
[Website] [Diffusers Doc] [Project] [Code]
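
This entry links a Diffusers doc; the sketch below is written from memory of that integration, so treat the adapter checkpoint (`openmmlab/PIA-condition-adapter`), the base model, and the input file as assumptions to verify against the linked doc:

```python
# PIA sketch (assumptions noted above): animate a still image with a
# plug-and-play motion adapter on a personalized SD checkpoint.
import torch
from diffusers import MotionAdapter, PIAPipeline
from diffusers.utils import export_to_gif, load_image

adapter = MotionAdapter.from_pretrained("openmmlab/PIA-condition-adapter")
pipe = PIAPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V6.0_B1_noVAE",
    motion_adapter=adapter, torch_dtype=torch.float16,
).to("cuda")

image = load_image("still.png")  # hypothetical input still image
frames = pipe(image=image, prompt="cat wizard, waving").frames[0]
export_to_gif(frames, "pia.gif")
```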

MotionMaster: Training-free Camera Motion Transfer For Video Generation
[Website] [Project] [Code]

Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
[Website] [Project] [Code]
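
Stable Video Diffusion is also available in Diffusers; a minimal image-to-video sketch (checkpoint ID and input path are illustrative):

```python
# Stable Video Diffusion sketch: image-to-video from a single frame.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import export_to_video, load_image

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16, variant="fp16",
).to("cuda")

image = load_image("first_frame.png").resize((1024, 576))
frames = pipe(image, decode_chunk_size=8).frames[0]
export_to_video(frames, "generated.mp4", fps=7)
```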

Motion Inversion for Video Customization
[Website] [Project] [Code]

MagicAvatar: Multimodal Avatar Generation and Animation
[Website] [Project] [Code]

TrailBlazer: Trajectory Control for Diffusion-Based Video Generation
[Website] [Project] [Code]

Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos
[Website] [Project] [Code]

Breathing Life Into Sketches Using Text-to-Video Priors
[Website] [Project] [Code]

Latent Video Diffusion Models for High-Fidelity Long Video Generation
[Website] [Project] [Code]

Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance
[Website] [Project] [Code]

Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
[Website] [Project] [Code]

Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models
[Website] [Project] [Code]

VideoComposer: Compositional Video Synthesis with Motion Controllability
[Website] [Project] [Code]

DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion
[Website] [Project] [Code]

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
[Website] [Project] [Code]

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
[Website] [Project] [Code]

LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation
[Website] [Project] [Code]

MagicDance: Realistic Human Dance Video Generation with Motions & Facial Expressions Transfer
[Website] [Project] [Code]

LLM-grounded Video Diffusion Models
[Website] [Project] [Code]

FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling
[Website] [Project] [Code]

VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
[Website] [Project] [Code]

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
[Website] [Project] [Code]

VideoDreamer: Customized Multi-Subject Text-to-Video Generation with Disen-Mix Finetuning
[Website] [Project] [Code]

I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
[Website] [Project] [Code]

FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline
[Website] [Project] [Code]

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation
[Website] [Project] [Code]

ART⋅V: Auto-Regressive Text-to-Video Generation with Diffusion Models
[Website] [Project] [Code]

FlowZero: Zero-Shot Text-to-Video Synthesis with LLM-Driven Dynamic Scene Syntax
[Website] [Project] [Code]

VideoBooth: Diffusion-based Video Generation with Image Prompts
[Website] [Project] [Code]

MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
[Website] [Project] [Code]

LivePhoto: Real Image Animation with Text-guided Motion Control
[Website] [Project] [Code]

AnimateZero: Video Diffusion Models are Zero-Shot Image Animators
[Website] [Project] [Code]

DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
[Website] [Project] [Code]

Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation
[Website] [Project] [Code]

DreaMoving: A Human Dance Video Generation Framework based on Diffusion Models
[Website] [Project] [Code]

Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution
[Website] [Project] [Code]

FreeInit: Bridging Initialization Gap in Video Diffusion Models
[Website] [Project] [Code]

Text2AC-Zero: Consistent Synthesis of Animated Characters using 2D Diffusion
[Website] [Project] [Code]

StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter
[Website] [Project] [Code]

A Recipe for Scaling up Text-to-Video Generation with Text-free Videos
[Website] [Project] [Code]

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis
[Website] [Project] [Code]

Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions
[Website] [Project] [Code]

Latte: Latent Diffusion Transformer for Video Generation
[Website] [Project] [Code]

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
[Website] [Project] [Code]

SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models
[Website] [Project] [Code]

Towards A Better Metric for Text-to-Video Generation
[Website] [Project] [Code]

AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning
[Website] [Project] [Code]

Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation
[Website] [Project] [Code]

UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control
[Website] [Project] [Code]

VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models
[Website] [Project] [Code]

ID-Animator: Zero-Shot Identity-Preserving Human Video Generation
[Website] [Project] [Code]

FlexiFilm: Long Video Generation with Flexible Conditions
[Website] [Project] [Code]

TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation
[Website] [Project] [Code]

Cross-Modal Contextualized Diffusion Models for Text-Guided Visual Generation and Editing
[ICLR 2024] [Code]

SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces
[ICLR 2024] [Code]

Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model
[Website] [Code]

Diffusion Probabilistic Modeling for Video Generation
[Website] [Code]

DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
[Website] [Code]

VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
[Website] [Code]

STDiff: Spatio-temporal Diffusion for Continuous Stochastic Video Prediction
[Website] [Code]

Vlogger: Make Your Dream A Vlog
[Website] [Code]

Magic-Me: Identity-Specific Video Customized Diffusion
[Website] [Code]

VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models
[Website] [Code]

EchoReel: Enhancing Action Generation of Existing Video Diffusion Models
[Website] [Code]

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
[Website] [Code]

TAVGBench: Benchmarking Text to Audible-Video Generation
[Website] [Code]

Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
[CVPR 2024] [Project]

AtomoVideo: High Fidelity Image-to-Video Generation
[CVPR 2024] [Project]

Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition
[ICLR 2024] [Project]

TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models
[CVPR 2024] [Project]

AniClipart: Clipart Animation with Text-to-Video Priors
[Website] [Project]

Spectral Motion Alignment for Video Motion Transfer using Diffusion Models
[Website] [Project]

TimeRewind: Rewinding Time with Image-and-Events Video Diffusion
[Website] [Project]

VideoPoet: A Large Language Model for Zero-Shot Video Generation
[Website] [Project]

PEEKABOO: Interactive Video Generation via Masked-Diffusion
[Website] [Project]

Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation
[Website] [Project]

Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning
[Website] [Project]

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models
[Website] [Project]

Imagen Video: High Definition Video Generation with Diffusion Models
[Website] [Project]

MoVideo: Motion-Aware Video Generation with Diffusion Models
[Website] [Project]

Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer
[Website] [Project]

Smooth Video Synthesis with Noise Constraints on Diffusion Models for One-shot Video Tuning
[Website] [Project]

VideoAssembler: Identity-Consistent Video Generation with Reference Entities using Diffusion Model
[Website] [Project]

MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation
[Website] [Project]

Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models
[Website] [Project]

GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation
[Website] [Project]

Customizing Motion in Text-to-Video Diffusion Models
[Website] [Project]

Photorealistic Video Generation with Diffusion Models
[Website] [Project]

VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM
[Website] [Project]

Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
[Website] [Project]

ActAnywhere: Subject-Aware Video Background Generation
[Website] [Project]

Lumiere: A Space-Time Diffusion Model for Video Generation
[Website] [Project]

InstructVideo: Instructing Video Diffusion Models with Human Feedback
[Website] [Project]

Boximator: Generating Rich and Controllable Motions for Video Synthesis
[Website] [Project]

Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion
[Website] [Project]

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation
[Website] [Project]

Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation
[Website] [Project]

Audio-Synchronized Visual Animation
[Website] [Project]

VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis
[Website] [Project]

S2DM: Sector-Shaped Diffusion Models for Video Generation
[Website] [Project]

AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment
[Website] [Project]

TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models
[Website] [Project]

Grid Diffusion Models for Text-to-Video Generation
[Website]

SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
[Website]

Dual-Stream Diffusion Net for Text-to-Video Generation
[Website]

SimDA: Simple Diffusion Adapter for Efficient Video Generation
[Website]

VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation
[Website]

Empowering Dynamics-aware Text-to-Video Diffusion with Large Language Models
[Website]

ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation
[Website]

LatentWarp: Consistent Diffusion Latents for Zero-Shot Video-to-Video Translation
[Website]

Optimal Noise pursuit for Augmenting Text-to-Video Generation
[Website]

Make Pixels Dance: High-Dynamic Video Generation
[Website]

GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
[Website]

Highly Detailed and Temporal Consistent Video Stylization via Synchronized Multi-Frame Diffusion
[Website]

Decouple Content and Motion for Conditional Image-to-Video Generation
[Website]

X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention
[Website]

F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis
[Website]

MTVG: Multi-text Video Generation with Text-to-Video Models
[Website]

VideoLCM: Video Latent Consistency Model
[Website]

MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation
[Website]

I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models
[Website]

360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model
[Website]

CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
[Website]

Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation
[Website]

Training-Free Semantic Video Composition via Pre-trained Diffusion Model
[Website]

Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
[Website]

Diffutoon: High-Resolution Editable Toon Shading via Diffusion Models
[Website]

Human Video Translation via Query Warping
[Website]

Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation
[Website]

Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
[Website]

EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
[Website]

Context-aware Talking Face Video Generation
[Website]

Pix2Gif: Motion-Guided Diffusion for GIF Generation
[Website]

Intention-driven Ego-to-Exo Video Generation
[Website]

AnimateDiff-Lightning: Cross-Model Diffusion Distillation
[Website]

Frame by Familiar Frame: Understanding Replication in Video Diffusion Models
[Website]

Matten: Video Generation with Mamba-Attention
[Website]

Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models
[Website]

Video Editing

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
[ICCV 2023 Oral] [Website] [Project] [Code]

Text2LIVE: Text-Driven Layered Image and Video Editing
[ECCV 2022 Oral] [Project] [Code]

Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
[CVPR 2023] [Project] [Code]

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
[ICCV 2023] [Project] [Code]

StableVideo: Text-driven Consistency-aware Diffusion Video Editing
[ICCV 2023] [Website] [Code]

Video-P2P: Video Editing with Cross-attention Control
[Website] [Project] [Code]

CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
[Website] [Project] [Code]

MagicEdit: High-Fidelity and Temporally Coherent Video Editing
[Website] [Project] [Code]

TokenFlow: Consistent Diffusion Features for Consistent Video Editing
[Website] [Project] [Code]

ControlVideo: Adding Conditional Control for One Shot Text-to-Video Editing
[Website] [Project] [Code]

Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts
[Website] [Project] [Code]

MotionDirector: Motion Customization of Text-to-Video Diffusion Models
[Website] [Project] [Code]

EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing
[Website] [Project] [Code]

RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models
[Website] [Project] [Code]

Ground-A-Video: Zero-shot Grounded Video Editing using Text-to-image Diffusion Models
[Website] [Project] [Code]

MotionEditor: Editing Video Motion via Content-Aware Diffusion
[Website] [Project] [Code]

VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models
[Website] [Project] [Code]

MagicStick: Controllable Video Editing via Control Handle Transformations
[Website] [Project] [Code]

VidToMe: Video Token Merging for Zero-Shot Video Editing
[Website] [Project] [Code]

VASE: Object-Centric Appearance and Shape Manipulation of Real Videos
[Website] [Project] [Code]

Neural Video Fields Editing
[Website] [Project] [Code]

UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing
[Website] [Project] [Code]

Vid2Vid-zero: Zero-Shot Video Editing Using Off-the-Shelf Image Diffusion Models
[Website] [Code]

DiffSLVA: Harnessing Diffusion Models for Sign Language Video Anonymization
[Website] [Code]

LOVECon: Text-driven Training-Free Long Video Editing with ControlNet
[Website] [Code]

Pix2Video: Video Editing Using Image Diffusion
[Website] [Code]

Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer
[Website] [Code]

Flow-Guided Diffusion for Video Inpainting
[Website] [Code]

Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models
[Website] [Code]

Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing
[Website] [Code]

Shape-Aware Text-Driven Layered Video Editing
[CVPR 2023] [Website] [Project]

DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing
[Website] [Project]

FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing
[Website] [Project]

VidEdit: Zero-Shot and Spatially Aware Text-Driven Video Editing
[Website] [Project]

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
[Website] [Project]

Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
[Website] [Project]

MeDM: Mediating Image Diffusion Models for Video-to-Video Translation with Temporal Correspondence Guidance
[Website] [Project]

Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models
[Website] [Project]

DreamMotion: Space-Time Self-Similarity Score Distillation for Zero-Shot Video Editing
[Website] [Project]

Edit Temporal-Consistent Videos with Image Diffusion Model
[Website]

Cut-and-Paste: Subject-Driven Video Editing with Attention Control
[Website]

MagicProp: Diffusion-based Video Editing via Motion-aware Appearance Propagation
[Website]

Dreamix: Video Diffusion Models Are General Video Editors
[Website]

Towards Consistent Video Editing with Text-to-Image Diffusion Models
[Website]

EVE: Efficient zero-shot text-based Video Editing with Depth Map Guidance and Temporal Consistency Constraints
[Website]

CCEdit: Creative and Controllable Video Editing via Diffusion Models
[Website]

Fuse Your Latents: Video Editing with Multi-source Latent Diffusion Models
[Website]

FastBlend: a Powerful Model-Free Toolkit Making Video Stylization Easier
[Website]

VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models
[Website]

RealCraft: Attention Control as A Solution for Zero-shot Long Video Editing
[Website]

Object-Centric Diffusion for Efficient Video Editing
[Website]

FastVideoEdit: Leveraging Consistency Models for Efficient Text-to-Video Editing
[Website]

Video Editing via Factorized Diffusion Distillation
[Website]

EffiVED: Efficient Video Editing via Text-instruction Diffusion Models
[Website]

Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion
[Website]

GenVideo: One-shot Target-image and Shape Aware Video Editing using T2I Diffusion Models
[Website]