all AI news
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code. (arXiv:2206.11249v3 [cs.CL] UPDATED)
June 27, 2022, 1:11 a.m. | Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashi
cs.CL updates on arXiv.org arxiv.org
Evaluation in machine learning is usually informed by past choices, for
example which datasets or metrics to use. This standardization enables the
comparison on equal footing using leaderboards, but the evaluation choices
become sub-optimal as better alternatives arise. This problem is especially
pertinent in natural language generation which requires ever-improving suites
of datasets, metrics, and human evaluation to make definitive claims. To make
following best model evaluation practices easier, we introduce GEMv2. The new
version of the Generation, Evaluation, and …
More from arxiv.org / cs.CL updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Program Control Data Analyst
@ Ford Motor Company | Mexico
Vice President, Business Intelligence / Data & Analytics
@ AlphaSense | Remote - United States