Boter: Bootstrapping Knowledge Selection and Question Answering for Knowledge-based VQA
April 23, 2024, 4:47 a.m. | Dongze Hao, Qunbo Wang, Longteng Guo, Jie Jiang, Jing Liu
Source: cs.CV updates on arXiv.org
Abstract: Knowledge-based Visual Question Answering (VQA) requires models to incorporate external knowledge to respond to questions about visual content. Previous methods mostly follow the "retrieve and generate" paradigm. Initially, they utilize a pre-trained retriever to fetch relevant knowledge documents, subsequently employing them to generate answers. While these methods have demonstrated commendable performance in the task, they possess limitations: (1) they employ an independent retriever to acquire knowledge solely based on the similarity between the query and …