Feb. 6, 2024, 5:52 a.m. | Zichen Zhu, Yang Xu, Lu Chen, Jingkai Yang, Yichuan Ma, Yiming Sun, Hailin Wen, Jiaqi Liu, Jin…

cs.CV updates on arXiv.org

Rapid progress in multimodal large language models (MLLMs) highlights the need for challenging yet realistic benchmarks in the academic community. Existing benchmarks focus primarily on simple natural-image understanding; Multi, by contrast, is a cutting-edge benchmark for MLLMs, offering a comprehensive dataset for evaluating them on complex figures and tables as well as scientific questions. Reflecting current realistic examination styles, the benchmark provides multimodal inputs and requires responses that are either precise or open-ended, similar to real-life school tests. It …
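To make the distinction between precise and open-ended responses concrete, here is a minimal sketch of how a benchmark item with a precise gold answer might be scored by exact match. All names here (`BenchmarkItem`, `score_precise`, the example file and question) are hypothetical illustrations, not the Multi paper's actual data format or evaluation protocol:

```python
from dataclasses import dataclass

@dataclass
class BenchmarkItem:
    """A hypothetical Multi-style item: an image plus a question,
    with a gold answer and a flag for open-ended grading."""
    image_path: str
    question: str
    gold_answer: str
    open_ended: bool = False

def score_precise(prediction: str, gold: str) -> float:
    """Exact-match scoring after light normalization, a common
    choice for precise-answer benchmark questions."""
    def norm(s: str) -> str:
        return s.strip().lower()
    return 1.0 if norm(prediction) == norm(gold) else 0.0

# Example: a precise-answer item scored against a model prediction.
item = BenchmarkItem("figure3.png", "What is the slope of the line?", "2")
score = score_precise(" 2 ", item.gold_answer)  # exact match after normalization -> 1.0
```

Open-ended items would instead need rubric- or model-based grading, which is why benchmarks of this kind typically report the two answer types separately.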

