Feb. 9, 2024, 5:46 a.m. | Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang

cs.CV updates on arXiv.org

The success of recent text-to-image diffusion models is largely due to their capacity to be guided by a complex text prompt, which enables users to precisely describe the desired content. However, these models struggle to suppress undesired content that the prompt explicitly asks to be omitted from the generated image. In this paper, we analyze how to manipulate the text embeddings and remove unwanted content from them. We introduce two contributions, which we refer …

