all AI news
CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation
June 27, 2024, 4:42 a.m. | Pei Ke, Bosi Wen, Zhuoer Feng, Xiao Liu, Xuanyu Lei, Jiale Cheng, Shengyuan Wang, Aohan Zeng, Yuxiao Dong, Hongning Wang, Jie Tang, Minlie Huang
cs.CL updates on arXiv.org arxiv.org
Abstract: Since the natural language processing (NLP) community started to make large language models (LLMs) act as a critic to evaluate the quality of generated texts, most of the existing works train a critique generation model on the evaluation data labeled by GPT-4's direct prompting. We observe that these models lack the ability to generate informative critiques in both pointwise grading and pairwise comparison especially without references. As a result, their generated critiques cannot provide fine-grained …
abstract act arxiv community critique cs.ai cs.cl data evaluation generated language language model language models language processing large language large language model large language models llms natural natural language natural language processing nlp processing quality replace train type
More from arxiv.org / cs.CL updates on arXiv.org
ReFT: Reasoning with Reinforced Fine-Tuning
1 day, 16 hours ago |
arxiv.org
Exploring Defeasibility in Causal Reasoning
1 day, 16 hours ago |
arxiv.org
Jobs in AI, ML, Big Data
Quantitative Researcher – Algorithmic Research
@ Man Group | GB London Riverbank House
Software Engineering Expert
@ Sanofi | Budapest
Senior Bioinformatics Scientist
@ Illumina | US - Bay Area - Foster City
Senior Engineer - Generative AI Product Engineering (Remote-Eligible)
@ Capital One | McLean, VA
Graduate Assistant - Bioinformatics
@ University of Arkansas System | University of Arkansas at Little Rock
Senior AI-HPC Cluster Engineer
@ NVIDIA | US, CA, Santa Clara