Feb. 20, 2024, 5:52 a.m. | Christoph Leiter, Hoa Nguyen, Steffen Eger

cs.CL updates on arXiv.org arxiv.org

arXiv:2212.10469v2 Announce Type: replace
Abstract: State-of-the-art natural language generation evaluation metrics are based on black-box language models. Hence, recent works consider their explainability with the goals of better understandability for humans and better metric analysis, including failure cases. In contrast, our proposed method BMX (Boosting Natural Language Generation Metrics with Explainability) explicitly leverages explanations to boost the metrics' performance. In particular, we perceive feature importance explanations as word-level scores, which we convert, via power means, into a segment-level score. We …
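
The aggregation step mentioned in the abstract, turning word-level scores into one segment-level score via power means, can be illustrated with a minimal sketch. This is the generic power (generalized) mean only, not the authors' implementation: the function name `power_mean`, the example scores, and the restriction to strictly positive scores are assumptions made for illustration. The truncated abstract does not say which exponents are used or how the result is combined with the original metric.

```python
import math


def power_mean(scores, p):
    """Generalized (power) mean of word-level scores.

    Assumption for this sketch: all scores are strictly positive, so that
    negative and fractional exponents are well defined.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if any(s <= 0 for s in scores):
        raise ValueError("this sketch assumes strictly positive scores")
    if p == float("inf"):
        return max(scores)  # limit p -> +inf
    if p == float("-inf"):
        return min(scores)  # limit p -> -inf
    if p == 0:
        # Limit p -> 0 is the geometric mean.
        return math.exp(sum(math.log(s) for s in scores) / len(scores))
    return (sum(s ** p for s in scores) / len(scores)) ** (1.0 / p)


# Hypothetical word-level importance scores for one segment.
word_scores = [0.9, 0.8, 0.1, 0.7]
for p in (float("-inf"), -1, 0, 1, float("inf")):
    print(p, power_mean(word_scores, p))
```

The exponent `p` controls how strongly low-scoring words dominate the segment score: small `p` moves the result toward the minimum, large `p` toward the maximum, and `p = 1` gives the arithmetic mean.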
