FedEval-LLM: Federated Evaluation of Large Language Models on Downstream Tasks with Collective Wisdom
April 19, 2024, 4:42 a.m. | Yuanqin He, Yan Kang, Lixin Fan, Qiang Yang
cs.LG updates on arXiv.org arxiv.org
Abstract: Federated Learning (FL) has emerged as a promising solution for collaborative training of large language models (LLMs). However, the integration of LLMs into FL introduces new challenges, particularly concerning the evaluation of LLMs. Traditional evaluation methods that rely on labeled test sets and similarity-based metrics cover only a subset of the acceptable answers, thereby failing to accurately reflect the performance of LLMs on generative tasks. Meanwhile, although automatic evaluation methods that leverage advanced LLMs present …