Feb. 14, 2024, 5:43 a.m. | Sidharth Mudgal, Jong Lee, Harish Ganapathy, YaGuang Li, Tao Wang, Yanping Huang, Zhifeng Chen, Heng-

cs.LG updates on arXiv.org

KL-regularized reinforcement learning (RL) is a popular alignment framework for steering language model responses toward high-reward outcomes. We propose a modular solver for this RL objective, called controlled decoding (CD), which exerts control through a separate prefix-scorer module. At training time, the prefix scorer learns a value function for the reward; at inference time, it is used to control generation from a frozen base model, provably sampling from a solution to the RL objective. We …
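The decoding-time control the abstract describes can be sketched as tokenwise reweighting: the solution to the KL-regularized RL objective multiplies the frozen base model's next-token probabilities by the exponentiated value (prefix score) of each candidate continuation, scaled by the regularization strength. The function name, vocabulary, and numbers below are illustrative assumptions, not taken from the paper.

```python
import math

def controlled_next_token_probs(base_probs, prefix_scores, beta=1.0):
    """Reweight base-model token probabilities with a prefix scorer.

    Sketch of the KL-regularized RL solution:
        pi(z | x)  ∝  p(z | x) * exp(V([x, z]) / beta)
    where p is the frozen base model, V is the learned value function
    (prefix scorer), and beta trades reward against staying close to
    the base model's distribution.
    """
    weights = [p * math.exp(v / beta) for p, v in zip(base_probs, prefix_scores)]
    total = sum(weights)
    return [w / total for w in weights]

# Toy 3-token vocabulary (hypothetical values):
base = [0.5, 0.3, 0.2]            # frozen base-model probabilities
values = [0.0, 1.0, -1.0]         # prefix scorer's expected reward per token
probs = controlled_next_token_probs(base, values, beta=1.0)
```

Tokens whose continuations the prefix scorer values highly are up-weighted relative to the base model; as `beta` grows, `probs` approaches `base`, recovering the unregularized base model in the limit.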

