all AI news
Evaluating Text-to-Visual Generation with Image-to-Text Generation
April 2, 2024, 7:44 p.m. | Zhiqiu Lin, Deepak Pathak, Baiqi Li, Jiayao Li, Xide Xia, Graham Neubig, Pengchuan Zhang, Deva Ramanan
cs.LG updates on arXiv.org arxiv.org
Abstract: Despite significant progress in generative AI, comprehensive evaluation remains challenging because of the lack of effective metrics and standardized benchmarks. For instance, the widely-used CLIPScore measures the alignment between a (generated) image and text prompt, but it fails to produce reliable scores for complex prompts involving compositions of objects, attributes, and relations. One reason is that text encoders of CLIP can notoriously act as a "bag of words", conflating prompts such as "the horse is …
abstract alignment arxiv benchmarks cs.ai cs.cl cs.cv cs.lg cs.mm evaluation generated generative image image-to-text instance metrics objects progress prompt prompts text text generation type visual
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead Data Engineer
@ WorkMoney | New York City, United States - Remote