June 4, 2024, 4:54 a.m. | Zhumin Chu, Qingyao Ai, Yiteng Tu, Haitao Li, Yiqun Liu

cs.CL updates on arXiv.org

arXiv:2401.15641v2 Announce Type: replace-cross
Abstract: The impressive performance of large language models (LLMs) has attracted considerable attention from the academic and industrial communities. Besides how to construct and train LLMs, how to effectively evaluate and compare the capacity of LLMs has also been well recognized as an important yet difficult problem. Existing paradigms rely on either human annotators or model-based evaluators to evaluate the performance of LLMs on different tasks. However, these paradigms often suffer from high cost, low generalizability, …

