all AI news
Distinctive Image Captioning via CLIP Guided Group Optimization. (arXiv:2208.04254v3 [cs.CV] UPDATED)
Aug. 12, 2022, 1:12 a.m. | Youyuan Zhang, Jiuniu Wang, Hao Wu, Wenjia Xu
cs.CV updates on arXiv.org arxiv.org
Image captioning models are usually trained according to human annotated
ground-truth captions, which could generate accurate but generic captions. In
this paper, we focus on generating the distinctive captions that can
distinguish the target image from other similar images. To evaluate the
distinctiveness of captions, we introduce a series of metrics that use
large-scale vision-language pre-training model CLIP to quantify the
distinctiveness. To further improve the distinctiveness of captioning models,
we propose a simple and effective training strategy which trains …
More from arxiv.org / cs.CV updates on arXiv.org
Jobs in AI, ML, Big Data
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Data Analyst
@ SEAKR Engineering | Englewood, CO, United States
Data Analyst II
@ Postman | Bengaluru, India
Data Architect
@ FORSEVEN | Warwick, GB
Director, Data Science
@ Visa | Washington, DC, United States
Senior Manager, Data Science - Emerging ML
@ Capital One | McLean, VA