GEMv2: Multilingual NLG Benchmarking in a Single Line of Code. (arXiv:2206.11249v2 [cs.CL] UPDATED)
June 24, 2022, 1:12 a.m. | Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashi
cs.CL updates on arXiv.org
Evaluation in machine learning is usually informed by past choices, for
example which datasets or metrics to use. This standardization enables the
comparison on equal footing using leaderboards, but the evaluation choices
become sub-optimal as better alternatives arise. This problem is especially
pertinent in natural language generation which requires ever-improving suites
of datasets, metrics, and human evaluation to make definitive claims. To make
following best model evaluation practices easier, we introduce GEMv2. The new
version of the Generation, Evaluation, and …