All You May Need for VQA are Image Captions. (arXiv:2205.01883v1 [cs.CV])
Web: http://arxiv.org/abs/2205.01883
May 5, 2022, 1:11 a.m. | Soravit Changpinyo, Doron Kukliansky, Idan Szpektor, Xi Chen, Nan Ding, Radu Soricut
cs.CL updates on arXiv.org
Visual Question Answering (VQA) has benefited from increasingly sophisticated
models, but has not enjoyed the same level of engagement in terms of data
creation. In this paper, we propose a method that automatically derives VQA
examples at volume, by leveraging the abundance of existing image-caption
annotations combined with neural models for textual question generation. We
show that the resulting data is of high quality. VQA models trained on our data
improve state-of-the-art zero-shot accuracy by double digits and achieve a
level …
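The abstract describes deriving VQA training pairs automatically from existing image captions via question generation. The paper uses neural models for this step; as a rough illustration of the caption-to-(question, answer) idea only, here is a toy, template-based sketch (the function name, the masking template, and the candidate-answer list are all hypothetical and much cruder than the neural approach the authors describe):

```python
def caption_to_vqa_pairs(caption, answer_candidates):
    """Toy sketch: turn one image caption into (question, answer) pairs.

    For each candidate answer span found in the caption, mask it with
    "what" to form a fill-in-the-blank style question. The paper instead
    uses a neural question-generation model; this template version only
    illustrates the data-derivation idea.
    """
    pairs = []
    for ans in answer_candidates:
        if ans in caption:
            # Mask the first occurrence of the answer span to build a question.
            question = caption.replace(ans, "what", 1).rstrip(".") + "?"
            pairs.append((question.capitalize(), ans))
    return pairs


# Example usage with a made-up caption and candidate answers:
pairs = caption_to_vqa_pairs("a dog playing with a red ball", ["dog", "red"])
# One pair per candidate found in the caption.
```

A real pipeline would also filter low-quality questions and pick answer candidates automatically (e.g. from noun phrases); both steps are omitted here for brevity.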