April 22, 2024, 4:45 a.m. | Longfei Huang, Shupeng Zhong, Xiangyu Wu, Ruoxuan Li, Qingguo Chen, Yang Yang

cs.CV updates on arXiv.org arxiv.org

arXiv:2404.12739v1 Announce Type: new
Abstract: This report introduces a solution to the Topic 1 Zero-shot Image Captioning of 2024 NICE : New frontiers for zero-shot Image Captioning Evaluation. In contrast to NICE 2023 datasets, this challenge involves new annotations by humans with significant differences in caption style and content. Therefore, we enhance image captions effectively through retrieval augmentation and caption grading methods. At the data level, we utilize high-quality captions generated by image caption models as training data to address …

abstract annotations arxiv captioning challenge contrast cs.cv datasets differences evaluation frontiers humans image nice report solution style type zero-shot

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Machine Learning Engineer

@ Apple | Sunnyvale, California, United States