Multi-MELO: Unified Multimodal Model Editing with Dynamic LoRA

This repo contains the source code of our proposed Multi-MELO, a unified multimodal model editing method that supports editing different network architectures on both image-to-text and text-to-image tasks.

Updates

  • 2023/12/14: Experiments on editing LDM for personalization during text-to-image generation. 🎨
  • 2023/10/29: Experiments on editing BLIP-2 OPT for VQA and Image Captioning. 🎊
  • 2023/09/02: Pulled the vector database out of the layer forward pass. ⭐

Table of Contents

  • Introduction
  • Experiments
  • Prepare Environments
  • Prepare Datasets
  • Quick Start
  • Acknowledgments

Introduction

Model editing aims to correct hallucinations or incorporate new knowledge into a pre-trained model. Most previous work focuses on model editing in the textual modality alone, while editing for multimodal models is not well studied. Recent research has begun to investigate how to adapt language model editors to multimodal scenarios; however, these methods are limited to image-to-text tasks and similar model architectures. The text-to-image editing task has not been explored, and it poses significant challenges due to the diversity of complex network architectures. In this paper, we propose a unified multimodal model editing framework based on dynamic LoRA (Multi-MELO), which enables effective editing of various multimodal models by dynamically activating the LoRA blocks that encode the relevant knowledge. We apply the framework to editing diverse multimodal models (i.e., BLIP-2 and the latent diffusion model) on three downstream tasks: image captioning, visual question answering, and text-to-image generation.
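
To make the idea concrete, below is a minimal, self-contained sketch of dynamic LoRA routing: a frozen linear layer carries a bank of LoRA blocks, and a vector-database-style lookup picks the block (if any) whose stored edit key is closest to the incoming query. All names here (DynamicLoRALinear, route, the radius threshold) are hypothetical illustrations, not the classes used in this repo; the real implementation lives in the modified PEFT package described below.

import torch
import torch.nn as nn
from typing import Optional

class DynamicLoRALinear(nn.Module):
    """Hypothetical sketch: a frozen linear layer plus a bank of LoRA blocks,
    one block per edit cluster, selected by an externally supplied index."""

    def __init__(self, base: nn.Linear, num_blocks: int, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pre-trained weights stay frozen
        in_f, out_f = base.in_features, base.out_features
        # One low-rank (A, B) pair per edit cluster.
        self.lora_A = nn.Parameter(torch.randn(num_blocks, rank, in_f) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(num_blocks, out_f, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor, block_idx: Optional[int] = None) -> torch.Tensor:
        y = self.base(x)
        if block_idx is None:  # no matching edit: behave exactly like the base model
            return y
        A, B = self.lora_A[block_idx], self.lora_B[block_idx]
        return y + (x @ A.T @ B.T) * self.scaling

def route(query: torch.Tensor, keys: torch.Tensor, radius: float) -> Optional[int]:
    """Hypothetical vector-database lookup: return the index of the nearest stored
    edit key if it falls within `radius`, otherwise None (keep the base model)."""
    dists = torch.cdist(query.unsqueeze(0), keys).squeeze(0)
    idx = int(torch.argmin(dists))
    return idx if float(dists[idx]) <= radius else None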

Experiments

Main results of experiments based on BLIP-2 and Multi-MELO are shown in Table 1 and Table 2.

Prepare Environments

The required CUDA environment and library dependencies are listed in:

requirements.txt
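
Assuming a standard pip workflow (the exact CUDA toolkit and package versions are whatever requirements.txt pins), the dependencies can be installed with:

pip install -r requirements.txt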

Then install our modified PEFT:

🤗 PEFT-Multi-MELO

cd peft_egg
pip install -e .

Detailed implementation of MELO is in ./peft_egg/src/perf/melo.py
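
For orientation only, the snippet below shows how a LoRA adapter is attached with the standard upstream PEFT API; the dynamic-LoRA layer from melo.py replaces the plain LoRA layer inside this flow, and its actual config class and fields may differ. The model name and target modules below are placeholders, not this repo's settings.

from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, get_peft_model

# Placeholder base model; the repo edits BLIP-2 and LDM instead.
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
# Standard LoRA config; the dynamic MELO variant adds block routing on top of this.
config = LoraConfig(r=4, lora_alpha=8, target_modules=["q", "v"], lora_dropout=0.0)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the low-rank adapters are trainable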

Prepare Datasets

Datasets for both editing and evaluation are listed as follows:

  • Please refer to EasyEdit for the editing and evaluation dataset for VQA.

  • Please refer to EasyEdit for the editing and evaluation dataset for Caption.

  • Please refer to DreamBooth for the editing and evaluation dataset for ImGen.

Quick Start

Following the Hydra format, configurations for editing BLIP-2 and LDM are provided in ./BLIP2_melo/config and ./DreamBooth_melo/config, respectively.
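
The +alg=..., +experiment=..., and +model=... arguments used below are Hydra config-group overrides: the + prefix adds a group that is not in the default list, and Hydra merges the selected YAML files into one config object. As a rough, assumption-laden sketch of such an entry point (the actual run_VL.py and *_run.py scripts will differ; Hydra >= 1.2 is assumed):

import hydra
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_path="config", config_name=None, version_base=None)
def main(cfg: DictConfig) -> None:
    # `+alg=lora_blip +experiment=vqa +model=blip2` on the command line selects
    # config/alg/lora_blip.yaml, config/experiment/vqa.yaml and config/model/blip2.yaml
    # and composes them into a single cfg.
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    main()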

Editing BLIP-2 on VQA with Multi-MELO

cd BLIP2_melo
python run_VL.py +alg=lora_blip +experiment=vqa +model=blip2

Editing BLIP-2 on Caption with Multi-MELO

cd BLIP2_melo
python run_VL.py +alg=lora_blip +experiment=caption +model=blip2

Editing LDM on ImGen with Multi-MELO

cd DreamBooth_melo
python ft_run.py +algs=ft_diff +experiment=diffusion +model=dreambooth
python lora_run.py +algs=lora_diff +experiment=diffusion +model=dreambooth
python melo_run.py +algs=melo_diff +experiment=diffusion +model=dreambooth
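
After editing, the personalized concept can be sampled from the edited LDM. The sketch below is an illustration only, using the diffusers library; the base checkpoint, the output directory, and the rare identifier token ("sks dog") are placeholders, not this repo's actual interface or saved format.

import torch
from diffusers import StableDiffusionPipeline

# Placeholder base checkpoint and LoRA output directory.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("outputs/melo_diff_lora")  # hypothetical path to the edited weights
image = pipe("a photo of sks dog in front of the Acropolis", num_inference_steps=50).images[0]
image.save("edited_generation.png")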

Acknowledgments

We would like to thank the following individuals and organizations for their contributions to this project:

  • Hugging Face: for their support of the PEFT community and their development of the PEFT framework (https://github.com/huggingface/peft)
  • MELO: for the open-source MELO library, which inspired our work (https://github.com/ECNU-ICALK/MELO)
  • EasyEdit: for the open-source datasets (https://github.com/zjunlp/EasyEdit)
  • DreamBooth: for the open-source dataset (https://github.com/google/dreambooth)
