March 26, 2024, 4:47 a.m. | Li Zhuowan, Jasani Bhavan, Tang Peng, Ghadar Shabnam

cs.CV updates on arXiv.org

arXiv:2403.16385v1 Announce Type: new
Abstract: Understanding data visualizations like charts and plots requires reasoning about both visual elements and numerics. Although strong on extractive questions, current chart visual question answering (chart VQA) models struggle with complex reasoning questions. In this work, we address the lack of reasoning ability through data augmentation. We leverage Large Language Models (LLMs), which have been shown to have strong reasoning ability, as automatic data annotators that generate question-answer annotations for chart images. The key innovation …

