Provable Multi-Party Reinforcement Learning with Diverse Human Feedback
March 11, 2024, 4:41 a.m. | Huiying Zhong, Zhun Deng, Weijie J. Su, Zhiwei Steven Wu, Linjun Zhang
cs.LG updates on arXiv.org
Abstract: Reinforcement learning with human feedback (RLHF) is an emerging paradigm for aligning models with human preferences. In practice, RLHF aggregates preferences from multiple individuals whose diverse viewpoints may conflict with one another. Our work initiates the theoretical study of multi-party RLHF, which explicitly models the diverse preferences of multiple individuals. We show how traditional RLHF approaches can fail, since learning a single reward function cannot capture and balance the preferences of multiple individuals. To …
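The failure mode the abstract describes can be illustrated with a toy example (this sketch and its numbers are not from the paper; the reward values and the Nash-welfare aggregation rule are illustrative assumptions). With two parties whose preferences conflict, maximizing a single averaged reward can pick an action that leaves one party with nothing, whereas a balance-aware aggregation such as Nash social welfare favors a compromise:

```python
# Hypothetical per-party rewards for three candidate actions.
# Party 1 strongly prefers "a"; party 2 prefers "b"; "c" is a compromise.
rewards = {
    "a": (1.0, 0.0),
    "b": (0.0, 0.8),
    "c": (0.45, 0.45),
}

def mean_utility(u):
    """Single aggregated reward: the average across parties."""
    return sum(u) / len(u)

def nash_welfare(u):
    """Balance-aware aggregation: the product of per-party utilities."""
    prod = 1.0
    for x in u:
        prod *= x
    return prod

best_mean = max(rewards, key=lambda k: mean_utility(rewards[k]))
best_nash = max(rewards, key=lambda k: nash_welfare(rewards[k]))
print(best_mean)  # "a": the average ignores that party 2 gets zero
print(best_nash)  # "c": the product rewards balancing both parties
```

Averaging collapses the two parties into one scalar objective and selects "a" (mean 0.5) even though party 2 receives nothing; the Nash welfare is zero whenever any party gets zero, so it selects the compromise "c" instead.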