Feb. 20, 2024, 5:51 a.m. | Tejpalsingh Siledar, Swaroop Nath, Sankara Sri Raghava Ravindra Muddu, Rupasai Rangaraju, Swaprava Nath, Pushpak Bhattacharyya, Suman Banerjee, Amey P

cs.CL updates on arXiv.org

arXiv:2402.11683v1 Announce Type: new
Abstract: Evaluation of opinion summaries using conventional reference-based metrics rarely provides a holistic evaluation and has been shown to have a relatively low correlation with human judgments. Recent studies suggest using Large Language Models (LLMs) as reference-free metrics for NLG evaluation; however, they remain unexplored for opinion summary evaluation. Moreover, limited opinion summary evaluation datasets inhibit progress. To address this, we release the SUMMEVAL-OP dataset covering 7 dimensions related to the evaluation of opinion summaries: fluency, …

