March 20, 2024, 4:46 a.m. | Kota Sueyoshi, Takashi Matsubara

cs.CV updates on arXiv.org

arXiv:2311.16117v2 Announce Type: replace
Abstract: Diffusion models have achieved remarkable results in generating high-quality, diverse, and creative images. However, when it comes to text-based image generation, they often fail to capture the intended meaning presented in the text. For instance, a specified object may not be generated, an unnecessary object may be generated, and an adjective may alter objects it was not intended to modify. Moreover, we found that relationships indicating possession between objects are often overlooked. While users' intentions …
