March 15, 2024, 4:45 a.m. | Haiwen Huang, Songyou Peng, Dan Zhang, Andreas Geiger

cs.CV updates on arXiv.org

arXiv:2403.09593v1 Announce Type: new
Abstract: Names are essential to both human cognition and vision-language models. Open-vocabulary models use class names as text prompts to generalize to categories unseen during training. However, the quality of these names is often overlooked, and the names in existing datasets often lack sufficient precision. In this paper, we address this underexplored problem by presenting a framework for "renovating" names in open-vocabulary segmentation benchmarks (RENOVATE). Through a human study, we demonstrate that the names generated by our model are more precise descriptions of …
