Web: http://arxiv.org/abs/2112.10728

May 5, 2022, 1:10 a.m. | Revanth Gangi Reddy, Xilin Rui, Manling Li, Xudong Lin, Haoyang Wen, Jaemin Cho, Lifu Huang, Mohit Bansal, Avirup Sil, Shih-Fu Chang, Alexander Schwin

cs.CV updates on arXiv.org arxiv.org

Recently, there has been an increasing interest in building question
answering (QA) models that reason across multiple modalities, such as text and
images. However, QA using images is often limited to just picking the answer
from a pre-defined set of options. In addition, images in the real world,
especially in news, have objects that are co-referential to the text, with
complementary information from both modalities. In this paper, we present a new
QA evaluation benchmark with 1,384 questions over news …

arxiv cross extraction knowledge media news question answering

More from arxiv.org / cs.CV updates on arXiv.org

Data Analyst, Patagonia Action Works

@ Patagonia | Remote

Data & Insights Strategy & Innovation General Manager

@ Chevron Services Company, a division of Chevron U.S.A Inc. | Houston, TX

Faculty members in Research areas such as Bayesian and Spatial Statistics; Data Privacy and Security; AI/ML; NLP; Image and Video Data Analysis

@ Ahmedabad University | Ahmedabad, India

Director, Applied Mathematics & Computational Research Division

@ Lawrence Berkeley National Lab | Berkeley, Ca

Business Data Analyst

@ MainStreet Family Care | Birmingham, AL

Assistant/Associate Professor of the Practice in Business Analytics

@ Georgetown University McDonough School of Business | Washington DC