Web: http://arxiv.org/abs/2205.14458

Sept. 22, 2022, 1:14 a.m. | Longzhen Yang, Yihang Liu, Yitao Peng, Lianghua He

cs.CV updates on arXiv.org arxiv.org

Accuracy and diversity are two essential, measurable properties of natural
and semantically correct captions. Many efforts have improved one of them at
the expense of the other because of the trade-off between the two. In this
work, we show that the inferior standard of accuracy drawn from human
annotations (leave-one-out) is not appropriate for machine-generated captions.
To improve diversity while maintaining solid accuracy, we exploit a novel
Variational Transformer framework. By introducing the "Invisible Information
Prior" and the "Auto-selectable …
