Towards Complex Document Understanding By Discrete Reasoning. (arXiv:2207.11871v2 [cs.CV] UPDATED)
Sept. 8, 2022, 1:14 a.m. | Fengbin Zhu, Wenqiang Lei, Fuli Feng, Chao Wang, Haozhou Zhang, Tat-Seng Chua
cs.CV updates on arXiv.org
Document Visual Question Answering (VQA) aims to understand visually-rich
documents to answer questions in natural language, which is an emerging
research topic for both Natural Language Processing and Computer Vision. In
this work, we introduce a new Document VQA dataset, named TAT-DQA, which
consists of 3,067 document pages comprising semi-structured table(s) and
unstructured text as well as 16,558 question-answer pairs by extending the
TAT-QA dataset. These documents are sampled from real-world financial reports
and contain lots of numbers, which means …