jenci2114/csc413-project

An Image is Worth One Sentence: Fast Textual Inversion with Supreme Initialization

Text-driven image synthesis has emerged as a popular research area. Existing approaches for inverting an image into text face challenges such as requiring multiple images, converging slowly, or overfitting, which degrades editing capability. In this paper, we propose a novel initialization method for textual inversion that uses off-the-shelf classification or captioning models. This approach enables multi-token embedding learning from a single input image while eliminating the need for fine-tuning and ensuring faster convergence. We demonstrate a significant improvement in convergence speed compared to vanilla Textual Inversion (TI).
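The initialization idea can be illustrated with a minimal sketch. This is not the project's code: the function name `init_placeholder_embeddings`, the toy vocabulary, and the random embedding table are all hypothetical stand-ins, and the truncate-or-pad rule is an assumption made only for illustration. The sketch assumes a caption has already been produced by an off-the-shelf captioning model and shows one plausible way to seed several learnable placeholder embeddings from that caption's word embeddings instead of from a single generic token.

```python
import random


def init_placeholder_embeddings(caption, vocab, embedding_table, num_tokens):
    """Return num_tokens initial vectors copied from the caption's word embeddings."""
    # Keep only caption words that the toy vocabulary knows.
    token_ids = [vocab[w] for w in caption.lower().split() if w in vocab]
    if not token_ids:
        raise ValueError("caption has no in-vocabulary words")
    # Truncate long captions; pad short ones by repeating the last token (assumed rule).
    token_ids = token_ids[:num_tokens]
    token_ids += [token_ids[-1]] * (num_tokens - len(token_ids))
    # Copy each vector so later optimization does not modify the original table.
    return [list(embedding_table[i]) for i in token_ids]


# Toy stand-ins (hypothetical): a real setup would use the text encoder's
# tokenizer and its input embedding matrix.
vocab = {"a": 0, "red": 1, "ceramic": 2, "teapot": 3}
rng = random.Random(0)
embedding_table = [[rng.uniform(-1, 1) for _ in range(4)] for _ in vocab]

# In practice this string would come from a captioning or classification model.
caption = "A red ceramic teapot"
init = init_placeholder_embeddings(caption, vocab, embedding_table, num_tokens=5)
```

In a full pipeline, vectors like `init` would serve as the starting values of the learnable embeddings, which are then optimized against the single input image while the rest of the model stays frozen.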
