Distributional Preference Learning: Understanding and Accounting for Hidden Context in RLHF
April 18, 2024, 4:43 a.m. | Anand Siththaranjan, Cassidy Laidlaw, Dylan Hadfield-Menell
stat.ML updates on arXiv.org
Abstract: In practice, preference learning from human feedback depends on incomplete data with hidden context. Hidden context refers to data that affects the feedback received, but which is not represented in the data used to train a preference model. This captures common issues of data collection, such as having human annotators with varied preferences, cognitive processes that result in seemingly irrational behavior, and combining data labeled according to different criteria. We prove that standard applications of …
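The hidden-context problem described above can be illustrated with a toy sketch (my own hypothetical setup, not the paper's experiments): when annotators with opposing preferences label the same comparison, a standard Bradley-Terry preference model marginalizes over annotator identity and learns only the aggregate preference probability, erasing the disagreement.

```python
import numpy as np

# Hypothetical illustration: 70% of annotators prefer item A over B,
# 30% prefer B. Annotator identity is hidden context: it determines
# the label but is absent from the model's inputs.
rng = np.random.default_rng(0)
labels = rng.random(10_000) < 0.7  # True = "A preferred over B"

# Standard Bradley-Terry fit: scalar rewards r_A, r_B chosen to
# maximize the likelihood sigmoid(r_A - r_B) of the observed labels.
r = np.zeros(2)  # [r_A, r_B]
lr = 0.1
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(r[0] - r[1])))  # model's P(A > B)
    grad = labels.mean() - p                  # gradient of mean log-likelihood
    r[0] += lr * grad
    r[1] -= lr * grad

p_hat = 1.0 / (1.0 + np.exp(-(r[0] - r[1])))
# p_hat converges to labels.mean() (~0.70): the two annotator groups
# collapse into one aggregate score, and the fact that 30% of
# annotators strictly disagree is no longer recoverable.
```

A distributional approach, as the title suggests, would instead model a distribution over rewards for each item, so that bimodal annotator preferences remain visible rather than being averaged away.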