Naming, Describing, and Quantifying Visual Objects in Humans and LLMs
March 12, 2024, 4:52 a.m. | Alberto Testoni, Juell Sprott, Sandro Pezzelle
cs.CL updates on arXiv.org arxiv.org
Abstract: While human speakers use a variety of different expressions when describing the same object in an image, giving rise to a distribution of plausible labels driven by pragmatic constraints, the extent to which current Vision & Language Large Language Models (VLLMs) can mimic this crucial feature of language use is an open question. This applies to common, everyday objects, but it is particularly interesting for uncommon or novel objects for which a category label may …