When are Lemons Purple? The Concept Association Bias of Vision-Language Models
April 16, 2024, 4:45 a.m. | Yutaro Yamada, Yingtian Tang, Yoyo Zhang, Ilker Yildirim
cs.LG updates on arXiv.org arxiv.org
Abstract: Large-scale vision-language models such as CLIP have shown impressive performance on zero-shot image classification and image-to-text retrieval. However, this performance does not carry over to tasks that require a finer-grained correspondence between vision and language, such as Visual Question Answering (VQA). As a potential cause of the difficulty of applying these models to VQA and similar tasks, we report an interesting phenomenon of vision-language models, which we call the Concept Association Bias (CAB). We find that …
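The zero-shot classification the abstract refers to works by embedding an image and a set of candidate text prompts into a shared space and picking the prompt with the highest cosine similarity. A minimal sketch of that mechanism, using synthetic embeddings rather than the actual CLIP model (the labels, dimensions, and noise level are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(v):
    # Project vectors onto the unit sphere, as CLIP-style models do
    # before computing similarities.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Hypothetical text embeddings for prompts like "a photo of a {label}".
labels = ["lemon", "eggplant", "banana"]
text_emb = normalize(rng.normal(size=(3, 64)))

# A synthetic image embedding that lies close to the "lemon" prompt.
image_emb = normalize(text_emb[0] + 0.05 * rng.normal(size=64))

# Zero-shot prediction: the label whose text embedding has the highest
# cosine similarity with the image embedding.
sims = text_emb @ image_emb
pred = labels[int(np.argmax(sims))]
print(pred)  # → lemon
```

The CAB phenomenon in the paper concerns exactly this similarity step: because the model scores holistic image-text association rather than grounded attribute bindings, a question about one concept can be answered by an associated concept in the image.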