all AI news
Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models
April 22, 2024, 4:45 a.m. | Yian Li, Wentao Tian, Yang Jiao, Jingjing Chen, Yu-Gang Jiang
cs.CV updates on arXiv.org arxiv.org
Abstract: Counterfactual reasoning, as a crucial manifestation of human intelligence, refers to making presuppositions based on established facts and extrapolating potential outcomes. Existing multimodal large language models (MLLMs) have exhibited impressive cognitive and reasoning capabilities, which have been examined across a wide range of Visual Question Answering (VQA) benchmarks. Nevertheless, how will existing MLLMs perform when faced with counterfactual questions? To answer this question, we first curate a novel \textbf{C}ounter\textbf{F}actual \textbf{M}ulti\textbf{M}odal reasoning benchmark, abbreviated as \textbf{CFMM}, …
abstract arxiv benchmarking capabilities cognitive counterfactual cs.ai cs.cv facts human human intelligence intelligence language language models large language large language models making mllms modal multi-modal multimodal reasoning type
More from arxiv.org / cs.CV updates on arXiv.org
Compact 3D Scene Representation via Self-Organizing Gaussian Grids
1 day, 20 hours ago |
arxiv.org
Fingerprint Matching with Localized Deep Representation
1 day, 20 hours ago |
arxiv.org
Jobs in AI, ML, Big Data
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne