http://arxiv.org/abs/2209.08163

Sept. 20, 2022 | Ahmed Sabir, Francesc Moreno-Noguer, Pranava Madhyastha, Lluís Padró

cs.CL updates on arXiv.org

In this work, we focus on improving the captions generated by image-caption
generation systems. We propose a novel re-ranking approach that leverages
visual-semantic measures to identify the ideal caption that maximally captures
the visual information in the image. Our re-ranker utilizes the Belief Revision
framework (Blok et al., 2003) to calibrate the original likelihood of the top-n
captions by explicitly exploiting the semantic relatedness between the depicted
caption and the visual context. Our experiments demonstrate the utility of our
approach, …
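
As an illustration of the re-ranking idea described above, here is a minimal sketch in Python. It assumes the similarity-based revision rule commonly attributed to Blok et al. (2003), P(w|c) = P(w)^alpha with alpha = ((1 - sim) / (1 + sim))^(1 - P(c)); the abstract does not spell out the formula, so treat this as an assumption rather than the authors' implementation. The names `revise`, `rerank`, and the dictionary keys `p`, `sim`, and `p_ctx` are hypothetical, and the numbers are made up for the example.

```python
def revise(p_caption, similarity, p_context):
    """Revise a caption likelihood given its relatedness to the visual context.

    Assumed rule: P(w|c) = P(w) ** alpha, where
    alpha = ((1 - sim) / (1 + sim)) ** (1 - P(c)).
    Expects 0 < p_caption <= 1, 0 <= similarity < 1, 0 <= p_context <= 1.
    """
    alpha = ((1.0 - similarity) / (1.0 + similarity)) ** (1.0 - p_context)
    return p_caption ** alpha


def rerank(candidates):
    """Sort top-n caption candidates by revised likelihood, best first."""
    return sorted(
        candidates,
        key=lambda c: revise(c["p"], c["sim"], c["p_ctx"]),
        reverse=True,
    )


# Hypothetical candidates: "p" is the captioner's likelihood, "sim" the
# caption-to-context semantic relatedness, "p_ctx" the visual classifier's
# confidence in the context label.
candidates = [
    {"caption": "a man standing on a field", "p": 0.30, "sim": 0.1, "p_ctx": 0.5},
    {"caption": "a man throwing a frisbee", "p": 0.25, "sim": 0.8, "p_ctx": 0.5},
]
best = rerank(candidates)[0]["caption"]
```

Under this rule a similarity of zero leaves the original likelihood unchanged, while higher similarity pushes it toward 1, so a caption with a slightly lower original score can overtake one that is less related to the visual context.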


