Transparent Human Evaluation for Image Captioning. (arXiv:2111.08940v2 [cs.CL] UPDATED)
May 20, 2022, 1:11 a.m. | Jungo Kasai, Keisuke Sakaguchi, Lavinia Dunagan, Jacob Morrison, Ronan Le Bras, Yejin Choi, Noah A. Smith
cs.CL updates on arXiv.org arxiv.org
We establish THumB, a rubric-based human evaluation protocol for image
captioning models. Our scoring rubrics and their definitions are carefully
developed based on machine- and human-generated captions on the MSCOCO dataset.
Each caption is evaluated along two main dimensions in a tradeoff (precision
and recall) as well as other aspects that measure the text quality (fluency,
conciseness, and inclusive language). Our evaluations reveal several
critical problems with current evaluation practice. Human-generated captions
show substantially higher quality than machine-generated ones, …
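
As a rough illustration of the rubric described above, the per-caption scores could be stored as a small record. This is a minimal sketch only: the class name `CaptionRating`, the field names, the numeric values, and the aggregation rule in `total` (mean of precision and recall, minus penalties for the text-quality aspects) are assumptions made for illustration and are not taken from the abstract.

```python
from dataclasses import dataclass


@dataclass
class CaptionRating:
    """One annotator's rubric scores for a single caption (hypothetical schema)."""
    precision: float                          # accuracy of what the caption states
    recall: float                             # coverage of the salient image content
    fluency_penalty: float = 0.0              # text-quality aspects, modeled here
    conciseness_penalty: float = 0.0          # as deductions (an assumption)
    inclusive_language_penalty: float = 0.0

    def total(self) -> float:
        # Assumed aggregation: average the two main dimensions, then
        # subtract the text-quality penalties.
        base = (self.precision + self.recall) / 2
        penalties = (
            self.fluency_penalty
            + self.conciseness_penalty
            + self.inclusive_language_penalty
        )
        return base - penalties


# Example with made-up scores: accurate but incomplete, slightly wordy caption.
rating = CaptionRating(precision=5, recall=3, conciseness_penalty=0.5)
print(rating.total())  # 3.5
```

Keeping precision and recall as separate fields, rather than storing only a single number, preserves the tradeoff between the two dimensions so that each can be inspected on its own.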