Understanding and Mitigating Compositional Issues in Text-to-Image Generative Models
June 13, 2024, 4:45 a.m. | Arman Zarei, Keivan Rezaei, Samyadeep Basu, Mehrdad Saberi, Mazda Moayeri, Priyatham Kattakinda, Soheil Feizi
cs.CV updates on arXiv.org arxiv.org
Abstract: Recent text-to-image diffusion-based generative models have the stunning ability to generate highly detailed and photo-realistic images and achieve state-of-the-art low FID scores on challenging image generation benchmarks. However, one of the primary failure modes of these text-to-image generative models is in composing attributes, objects, and their associated relationships accurately into an image. In our paper, we investigate this compositionality-based failure mode and highlight that imperfect text conditioning with the CLIP text encoder is one of the primary …