June 10, 2024, 4:48 a.m. | Lianyu Pang, Jian Yin, Baoquan Zhao, Feize Wu, Fu Lee Wang, Qing Li, Xudong Mao

arXiv:2406.05000v1 Announce Type: new
Abstract: Recent advances in text-to-image models have enabled high-quality personalized image synthesis of user-provided concepts with flexible textual control. In this work, we analyze the limitations of two primary techniques in text-to-image personalization: Textual Inversion and DreamBooth. When integrating the learned concept into new prompts, Textual Inversion tends to overfit the concept, while DreamBooth often overlooks it. We attribute these issues to the incorrect learning of the embedding alignment for the concept. We introduce AttnDreamBooth, a …


