April 17, 2024, 4:42 a.m. | Elham J. Barezi, Parisa Kordjamshidi

cs.LG updates on arXiv.org arxiv.org

arXiv:2404.10226v1 Announce Type: cross
Abstract: We analyze knowledge-based visual question answering, for which given a question, the models need to ground it into the visual modality and retrieve the relevant knowledge from a given large knowledge base (KB) to be able to answer. Our analysis has two folds, one based on designing neural architectures and training them from scratch, and another based on large pre-trained language models (LLMs). Our research questions are: 1) Can we effectively augment models by explicit …

abstract analysis analyze arxiv cs.ai cs.cl cs.cv cs.lg gap knowledge knowledge base question question answering reasoning type visual

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York