Sept. 22, 2022, 1:12 a.m. | Longzhen Yang, Yihang Liu, Yitao Peng, Lianghua He

cs.LG updates on arXiv.org

Accuracy and diversity are two essential, measurable properties of
natural and semantically correct captions. Many efforts have improved one
of them at the expense of the other because of the trade-off between them. In
this work, we show that the inferior accuracy standard derived from human
annotations (leave-one-out) is not appropriate for machine-generated captions.
To improve diversity while maintaining solid accuracy, we exploit a novel
Variational Transformer framework. By introducing the "Invisible Information
Prior" and the "Auto-selectable …

