PDF-VQA: A New Dataset for Real-World VQA on PDF Documents. (arXiv:2304.06447v2 [cs.CV] UPDATED)
cs.CV updates on arXiv.org
Document-based Visual Question Answering examines how well document images
are understood when queried with natural language questions. We propose a
new document-based VQA dataset, PDF-VQA, to comprehensively examine
document understanding from various aspects, including document element
recognition, document layout structure understanding, contextual
understanding, and key information extraction. Our PDF-VQA dataset extends the
current scale of document understanding, which is limited to a single document
page, to a new scale that asks questions over the full document of …