When are Lemons Purple? The Concept Association Bias of Vision-Language Models
April 16, 2024, 4:45 a.m. | Yutaro Yamada, Yingtian Tang, Yoyo Zhang, Ilker Yildirim
cs.LG updates on arXiv.org
Abstract: Large-scale vision-language models such as CLIP have shown impressive performance on zero-shot image classification and image-to-text retrieval. However, this performance does not carry over to tasks that require a finer-grained correspondence between vision and language, such as Visual Question Answering (VQA). As a potential cause of this difficulty, we report an interesting phenomenon of vision-language models, which we call the Concept Association Bias (CAB). We find that …
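For readers unfamiliar with the zero-shot classification setup the abstract refers to: CLIP embeds an image and a set of candidate text prompts into a shared space, then picks the prompt whose embedding is most similar to the image's. The minimal sketch below illustrates that mechanism with toy NumPy vectors standing in for real CLIP encoder outputs; the function name and the toy embeddings are illustrative assumptions, not code from the paper.

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs, labels, temperature=100.0):
    """CLIP-style zero-shot classification: score each candidate label by
    the cosine similarity between its text embedding and the image embedding,
    then softmax the scaled similarities into a probability distribution.
    (CLIP uses a learned temperature on the order of 100.)"""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = txt @ img                                   # cosine similarities
    logits = temperature * sims
    probs = np.exp(logits - logits.max())              # stable softmax
    probs /= probs.sum()
    return labels[int(np.argmax(probs))], probs

# Toy embeddings standing in for real CLIP encoder outputs (assumed values).
labels = ["a photo of a lemon", "a photo of an eggplant"]
image_emb = np.array([0.9, 0.1, 0.0])                  # mostly "lemon"-like
text_embs = np.array([[1.0, 0.0, 0.0],                 # "lemon" direction
                      [0.0, 1.0, 0.0]])                # "eggplant" direction

pred, probs = zero_shot_classify(image_emb, text_embs, labels)
```

The per-label scoring is independent of the other labels except through the softmax, which is one reason such models can excel at classification yet still struggle when a task (like VQA) requires binding a specific attribute, such as color, to a specific object.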