all AI news
Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models
April 12, 2024, 4:45 a.m. | Yasi Zhang, Peiyu Yu, Ying Nian Wu
cs.CV updates on arXiv.org arxiv.org
Abstract: Text-to-image diffusion models have shown great success in generating high-quality text-guided images. Yet, these models may still fail to semantically align generated images with the provided text prompts, leading to problems like incorrect attribute binding and/or catastrophic object neglect. Given the pervasive object-oriented structure underlying text prompts, we introduce a novel object-conditioned Energy-Based Attention Map Alignment (EBAMA) method to address the aforementioned problems. We show that an object-centric attribute binding loss naturally emerges by approximately …
abstract alignment arxiv attention cs.cv diffusion diffusion models energy generated image image diffusion images map object object-oriented prompts quality success text text-to-image type
More from arxiv.org / cs.CV updates on arXiv.org
Jobs in AI, ML, Big Data
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Lead Data Modeler
@ Sherwin-Williams | Cleveland, OH, United States