VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models
April 23, 2024, 4:48 a.m. | Haoyi Qiu, Wenbo Hu, Zi-Yi Dou, Nanyun Peng
cs.CV updates on arXiv.org (arxiv.org)
Abstract: Large Vision-Language Models (LVLMs) suffer from hallucination issues, wherein the models generate plausible-sounding but factually incorrect outputs, undermining their reliability. A comprehensive quantitative evaluation is necessary to identify and understand the extent of hallucinations in these models. However, existing benchmarks are often limited in scope, focusing mainly on object hallucinations. Furthermore, current evaluation methods struggle to effectively address the subtle semantic distinctions between model outputs and reference data, as well as the balance between hallucination …
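To make the coverage/faithfulness trade-off concrete, here is a minimal, hypothetical Python sketch, not the paper's actual method: it scores a model's output against reference annotations as simple set recall (coverage) and precision (faithfulness) over extracted object mentions. The function name, example data, and exact-string matching are all assumptions; exact matching is precisely what struggles with the subtle semantic distinctions the abstract mentions (e.g., "bike" vs. "bicycle" would count as a hallucination).

```python
from typing import Set, Tuple

def coverage_and_faithfulness(predicted: Set[str],
                              reference: Set[str]) -> Tuple[float, float]:
    """Set-level coverage (recall) and faithfulness (precision) over
    extracted object mentions. Illustrative sketch only; VALOR-EVAL's
    actual metric is not specified in the truncated abstract."""
    matched = predicted & reference
    coverage = len(matched) / len(reference) if reference else 0.0
    faithfulness = len(matched) / len(predicted) if predicted else 0.0
    return coverage, faithfulness

# Example: the model's caption mentions a "dog" not present in the image.
reference = {"person", "bicycle", "street"}   # ground-truth objects
predicted = {"person", "bicycle", "dog"}      # objects in model output
cov, faith = coverage_and_faithfulness(predicted, reference)
print(f"coverage={cov:.2f}, faithfulness={faith:.2f}")  # 0.67, 0.67
```

Note how the two scores pull in opposite directions: an output listing every plausible object maximizes coverage while risking hallucinations, whereas a terse output stays faithful but misses content, which is the balance the abstract refers to.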