Feb. 19, 2024, 5:46 a.m. | Corentin Royer, Bjoern Menze, Anjany Sekuboyina

cs.CV updates on arXiv.org

arXiv:2402.09262v2 Announce Type: replace
Abstract: We introduce MultiMedEval, an open-source toolkit for fair and reproducible evaluation of large, medical vision-language models (VLM). MultiMedEval comprehensively assesses the models' performance on a broad array of six multi-modal tasks, conducted over 23 datasets, and spanning over 11 medical domains. The chosen tasks and performance metrics are based on their widespread adoption in the community and their diversity, ensuring a thorough evaluation of the model's overall generalizability. We open-source a Python toolkit (github.com/corentin-ryr/MultiMedEval) with …
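To make the evaluation pattern described above concrete, here is a minimal, hypothetical sketch of how a benchmark harness of this kind can treat a model as a black-box callable and score it task by task. The names run_benchmark, load_task, TASKS, and ModelFn are illustrative assumptions, not the actual MultiMedEval API; consult the linked repository for the real interface.

    # Illustrative sketch only: run_benchmark, load_task, and TASKS are
    # hypothetical names, NOT the MultiMedEval API.
    from typing import Callable, Dict, List

    # The model under evaluation is a black-box callable mapping a text
    # prompt plus optional images to a text answer.
    ModelFn = Callable[[str, List[bytes]], str]

    TASKS = ["vqa", "report_generation", "classification"]  # assumed subset

    def load_task(name: str) -> List[dict]:
        # Placeholder: a real toolkit would download and parse the
        # datasets backing each task.
        return [{"prompt": "Is there a fracture?", "images": [], "target": "no"}]

    def run_benchmark(model: ModelFn, tasks: List[str] = TASKS) -> Dict[str, float]:
        """Evaluate the model on each task's dataset and report one score per task."""
        scores: Dict[str, float] = {}
        for task in tasks:
            dataset = load_task(task)
            correct = 0
            for example in dataset:
                answer = model(example["prompt"], example["images"])
                correct += int(answer.strip() == example["target"])
            scores[task] = correct / max(len(dataset), 1)
        return scores

The design point illustrated here is that the harness owns the datasets and metrics, so any model that can be wrapped as such a callable can be evaluated under identical conditions, which is what makes cross-model comparisons fair and reproducible.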
