April 2, 2024, 7:52 p.m. | Hang Zhou, Chenglong Wang, Yimin Hu, Tong Xiao, Chunliang Zhang, Jingbo Zhu

cs.CL updates on arXiv.org

arXiv:2404.00978v1 Announce Type: new
Abstract: Reinforcement learning with human feedback for aligning large language models (LLMs) trains a reward model typically using ranking loss with comparison pairs. However, the training procedure suffers from an inherent problem: the uncontrolled scaling of reward scores during reinforcement learning due to the lack of constraints while training the reward model. This paper proposes a Prior Constraints-based Reward Model (namely PCRM) training method to mitigate this problem. PCRM incorporates prior constraints, specifically, length ratio and cosine similarity …
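
For context, here is a minimal sketch of the pairwise ranking loss typically used to train reward models over comparison pairs, plus a hypothetical prior-constraint penalty on the score gap. The truncated abstract only names the constraints (length ratio and cosine similarity) without giving the exact PCRM formulation, so `constrained_loss`, `prior_margin`, and `weight` below are illustrative assumptions, not the paper's method.

```python
# Sketch, not the paper's exact formulation: standard Bradley-Terry-style
# ranking loss for reward models, plus an assumed prior-constraint term
# that bounds how far reward scores can drift apart.
import torch
import torch.nn.functional as F

def ranking_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    """Standard pairwise ranking loss over (chosen, rejected) reward scores."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()

def constrained_loss(r_chosen: torch.Tensor,
                     r_rejected: torch.Tensor,
                     prior_margin: torch.Tensor,
                     weight: float = 1.0) -> torch.Tensor:
    """Hypothetical constrained variant: encourage the reward gap to track a
    prior-derived margin (e.g. computed from length ratio or cosine
    similarity), which keeps the score scale from growing uncontrolled."""
    gap = r_chosen - r_rejected
    rank = -F.logsigmoid(gap).mean()
    constraint = F.mse_loss(gap, prior_margin)  # penalize deviation from the prior margin
    return rank + weight * constraint

# Dummy usage: reward scores for a batch of 4 comparison pairs
r_c, r_r = torch.randn(4), torch.randn(4)
margin = torch.full((4,), 0.5)  # assumed prior margin per pair
print(constrained_loss(r_c, r_r, margin))
```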

