all AI news
Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment
April 16, 2024, 4:51 a.m. | Geyang Guo, Ranchi Zhao, Tianyi Tang, Wayne Xin Zhao, Ji-Rong Wen
cs.CL updates on arXiv.org arxiv.org
Abstract: Alignment with human preference is a desired property of large language models (LLMs). Currently, the main alignment approach is based on reinforcement learning from human feedback (RLHF). Despite the effectiveness of RLHF, it is intricate to implement and train, thus recent studies explore how to develop alternative alignment approaches based on supervised fine-tuning (SFT). A major limitation of SFT is that it essentially does imitation learning, which cannot fully understand what are the expected behaviors. …
abstract alignment arxiv beyond cs.cl explore feedback fine-grained human human feedback language language models large language large language models llms property quality reinforcement reinforcement learning rlhf studies train type
More from arxiv.org / cs.CL updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Risk Management - Machine Learning and Model Delivery Services, Product Associate - Senior Associate-
@ JPMorgan Chase & Co. | Wilmington, DE, United States
Senior ML Engineer (Speech/ASR)
@ ObserveAI | Bengaluru