June 18, 2024, 4:43 a.m. | Amith Ananthram, Elias Stengel-Eskin, Carl Vondrick, Mohit Bansal, Kathleen McKeown

cs.CL updates on arXiv.org

arXiv:2406.11665v1 Announce Type: new
Abstract: Vision-language models (VLMs) can respond to queries about images in many languages. However, beyond language, culture affects how we see things. For example, individuals from Western cultures focus more on the central figure in an image while individuals from Eastern cultures attend more to scene context. In this work, we present a novel investigation that demonstrates and localizes VLMs' Western bias in image understanding. We evaluate large VLMs across subjective and objective visual tasks with …