March 7, 2024, 5:45 a.m. | Bingyan Liu, Chengyu Wang, Tingfeng Cao, Kui Jia, Jun Huang

cs.CV updates on arXiv.org

arXiv:2403.03431v1 Announce Type: new
Abstract: Deep Text-to-Image Synthesis (TIS) models such as Stable Diffusion have recently gained significant popularity for creative text-to-image generation. Yet, for domain-specific scenarios, tuning-free Text-guided Image Editing (TIE) is of greater importance to application developers: it modifies objects or object properties in images by manipulating feature components in attention layers during the generation process. However, little is known about what semantic meanings these attention layers have learned and which parts of the attention maps contribute to …
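The abstract's core idea, editing an image by manipulating attention-layer features during generation, can be illustrated with a toy sketch. The snippet below is not the paper's method: it is a minimal NumPy illustration of cross-attention in which an optional per-token rescaling of the attention map (here boosting one hypothetical "object" token) stands in for the kind of feature manipulation TIE methods perform inside a diffusion UNet. All names and shapes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(Q, K, V, token_scales=None):
    """Scaled dot-product cross-attention.

    Q: (n_pixels, d) image-feature queries.
    K, V: (n_tokens, d) text-token keys/values.
    token_scales: optional (n_tokens,) multipliers applied to the
    attention map and renormalised -- a toy stand-in for the
    attention-map manipulation used in tuning-free editing.
    """
    d = Q.shape[-1]
    attn = softmax(Q @ K.T / np.sqrt(d))          # (n_pixels, n_tokens)
    if token_scales is not None:
        attn = attn * token_scales                # amplify/suppress tokens
        attn = attn / attn.sum(-1, keepdims=True) # rows sum to 1 again
    return attn @ V                               # (n_pixels, d)

rng = np.random.default_rng(0)
Q = rng.normal(size=(16, 8))   # 16 "pixel" queries
K = rng.normal(size=(4, 8))    # 4 prompt tokens
V = rng.normal(size=(4, 8))

plain  = cross_attention(Q, K, V)
scales = np.array([1.0, 3.0, 1.0, 1.0])  # boost token 1, e.g. an object word
edited = cross_attention(Q, K, V, scales)
```

In real TIE pipelines the analogous manipulation is applied inside the denoiser's cross-attention layers at selected timesteps; which layers and which parts of the map actually carry object semantics is exactly the question the abstract raises.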

