Explaining CLIP's performance disparities on data from blind/low vision users
March 26, 2024, 4:49 a.m. | Daniela Massiceti, Camilla Longden, Agnieszka Słowik, Samuel Wills, Martin Grayson, Cecily Morrison
cs.CV updates on arXiv.org
Abstract: Large multi-modal models (LMMs) hold the potential to usher in a new era of automated visual assistance for people who are blind or low vision (BLV). Yet, these models have not been systematically evaluated on data captured by BLV users. We address this by empirically assessing CLIP, a widely-used LMM likely to underpin many assistive technologies. Testing 25 CLIP variants in a zero-shot classification task, we find that their accuracy is 15 percentage points lower …