Web: http://arxiv.org/abs/2201.10654

Jan. 27, 2022, 2:10 a.m. | Peixi Xiong, Quanzeng You, Pei Yu, Zicheng Liu, Ying Wu

cs.CV updates on arXiv.org

Visual Question Answering (VQA) attracts much attention from both industry
and academia. As a multi-modality task, it is challenging since it requires not
only visual and textual understanding, but also the ability to align
cross-modality representations. Previous approaches extensively employ
entity-level alignments, such as the correlations between the visual regions
and their semantic labels, or the interactions across question words and object
features. These attempts aim to improve the cross-modality representations,
while ignoring their internal relations. Instead, we propose to …

