June 17, 2024, 4:47 a.m. | Lital Binyamin, Yoad Tewel, Hilit Segev, Eran Hirsch, Royi Rassin, Gal Chechik

cs.CV updates on arXiv.org

arXiv:2406.10210v1 Announce Type: new
Abstract: Despite the unprecedented success of text-to-image diffusion models, controlling the number of depicted objects using text is surprisingly hard. Accurate counts matter for applications ranging from technical documents to children's books to illustrations for cooking recipes. Generating the correct number of objects is fundamentally challenging because the generative model needs to keep a sense of separate identity for every instance of the object, even if several objects look identical or overlap, and then carry out a global computation implicitly …

