Feb. 16, 2024, 5:46 a.m. | Tao Yang, Cuiling Lan, Yan Lu, Nanning Zheng

cs.CV updates on arXiv.org arxiv.org

arXiv:2402.09712v1 Announce Type: new
Abstract: Disentangled representation learning strives to extract the intrinsic factors within observed data. Factorizing these representations in an unsupervised manner is notably challenging and usually requires tailored loss functions or specific structural designs. In this paper, we introduce a new perspective and framework, demonstrating that diffusion models with cross-attention can serve as a powerful inductive bias to facilitate the learning of disentangled representations. We propose to encode an image to a set of concept tokens and …
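The abstract describes conditioning a diffusion model on a set of image-derived concept tokens through cross-attention. A minimal NumPy sketch of that mechanism is below; it is an illustration under assumed shapes and random (rather than learned) projection weights, not the authors' implementation. The names `spatial_feats`, `concept_tokens`, and all dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(spatial_feats, concept_tokens, d_head=16, seed=0):
    """Spatial features (queries) attend over concept tokens (keys/values).

    spatial_feats:  (n_pixels, d_model) flattened feature map, e.g. from a U-Net layer
    concept_tokens: (n_concepts, d_model) tokens produced by an image encoder
                    (hypothetical; the paper encodes an image to such a set)
    """
    rng = np.random.default_rng(seed)
    d_model = spatial_feats.shape[1]
    # Random projections stand in for learned attention weights.
    Wq = rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
    Wk = rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
    Wv = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
    Q = spatial_feats @ Wq
    K = concept_tokens @ Wk
    V = concept_tokens @ Wv
    # Each pixel distributes its attention across the concept tokens,
    # which is where the disentangling inductive bias is argued to arise.
    attn = softmax(Q @ K.T / np.sqrt(d_head))   # (n_pixels, n_concepts)
    return attn @ V, attn

feats = np.random.default_rng(1).standard_normal((64, 32))   # 8x8 map, d_model=32
tokens = np.random.default_rng(2).standard_normal((4, 32))   # 4 concept tokens
out, attn = cross_attention(feats, tokens)
```

Each spatial location can only gather information from the small token set, so the tokens are pressured to carry distinct, reusable factors of the image.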

