April 18, 2024, 4:43 a.m. | Anand Siththaranjan, Cassidy Laidlaw, Dylan Hadfield-Menell

stat.ML updates on arXiv.org

arXiv:2312.08358v2 Announce Type: replace-cross
Abstract: In practice, preference learning from human feedback depends on incomplete data with hidden context. Hidden context refers to data that affects the feedback received, but which is not represented in the data used to train a preference model. This captures common issues of data collection, such as having human annotators with varied preferences, cognitive processes that result in seemingly irrational behavior, and combining data labeled according to different criteria. We prove that standard applications of …

