Feb. 19, 2024, 5:42 a.m. | Le Bronnec Florian, Verine Alexandre, Negrevergne Benjamin, Chevaleyre Yann, Allauzen Alexandre

cs.LG updates on arXiv.org arxiv.org

arXiv:2402.10693v1 Announce Type: cross
Abstract: This paper introduces a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral, focusing on the adaptation of Precision and Recall metrics from image generation to text generation. This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora. By conducting a comprehensive evaluation of state-of-the-art language models, the study reveals significant insights into their performance on open-ended generation tasks, which …

abstract arxiv assessment cs.cl cs.lg diversity evaluation framework image image generation language language models large language large language models llama llms metrics mistral novel paper precision quality recall text text generation type

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Robotics Technician - 3rd Shift

@ GXO Logistics | Perris, CA, US, 92571