Visual Grounding Methods for VQA are Working for the Wrong Reasons!
April 17, 2024, 4:46 a.m. | Robik Shrestha, Kushal Kafle, Christopher Kanan
Source: cs.CL updates on arXiv.org
Abstract: Existing Visual Question Answering (VQA) methods tend to exploit dataset biases and spurious statistical correlations, instead of producing right answers for the right reasons. To address this issue, recent bias mitigation methods for VQA propose to incorporate visual cues (e.g., human attention maps) to better ground the VQA models, showcasing impressive gains. However, we show that the performance improvements are not a result of improved visual grounding, but a regularization effect which prevents over-fitting to …
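For context on the kind of "visual cue" grounding the abstract describes, below is a minimal, illustrative sketch (in PyTorch) of an auxiliary loss that aligns a VQA model's attention over image regions with a human attention map. This is an assumption-based example of the general approach, not the specific method proposed or analyzed in the paper; the names `grounding_loss`, `total_loss`, and the tensor shapes are hypothetical.

```python
# Illustrative sketch only: one common way visual-cue methods try to ground VQA
# models is to add an auxiliary term that pulls the model's region attention
# toward a human attention map. All names and shapes here are assumptions.
import torch
import torch.nn.functional as F


def grounding_loss(model_attn: torch.Tensor, human_attn: torch.Tensor) -> torch.Tensor:
    """KL divergence between human and model attention distributions over regions.

    model_attn: (batch, num_regions) unnormalized attention logits from the model
    human_attn: (batch, num_regions) non-negative human importance scores
    """
    log_p_model = F.log_softmax(model_attn, dim=-1)
    p_human = human_attn / human_attn.sum(dim=-1, keepdim=True).clamp(min=1e-8)
    return F.kl_div(log_p_model, p_human, reduction="batchmean")


def total_loss(answer_logits, answer_targets, model_attn, human_attn, lam=0.1):
    """Standard VQA answer cross-entropy plus the grounding regularizer, weighted by lam."""
    vqa_loss = F.cross_entropy(answer_logits, answer_targets)
    return vqa_loss + lam * grounding_loss(model_attn, human_attn)
```

The abstract's point is that the gains from terms like this may stem from a regularization effect that curbs over-fitting rather than from genuinely improved visual grounding.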