June 10, 2024, 4:42 a.m. | Sidi Lu, Wenbo Zhao, Chenyang Tao, Arpit Gupta, Shanchan Wu, Tagyoung Chung, Nanyun Peng

cs.CL updates on arXiv.org

arXiv:2306.11825v2 Announce Type: replace
Abstract: NeurAlly-Decomposed Oracle (NADO) is a powerful approach for controllable generation with large language models. It is designed to avoid catastrophic forgetting while achieving guaranteed convergence to an entropy-maximized closed-form optimal solution with reasonable modeling capacity. Despite this success, several challenges arise when applying NADO to a wide range of scenarios. Vanilla NADO suffers from vanishing gradients for low-probability control signals and relies heavily on regularization to satisfy the stochastic version of the Bellman equation. …

