Anti-Overestimation Dialogue Policy Learning for Task-Completion Dialogue System
April 16, 2024, 4:51 a.m. | Chang Tian, Wenpeng Yin, Marie-Francine Moens
cs.CL updates on arXiv.org
Abstract: A dialogue policy module is an essential part of task-completion dialogue systems. Recently, increasing interest has focused on reinforcement learning (RL)-based dialogue policy. Its favorable performance and wise action decisions rely on an accurate estimation of action values. The overestimation problem is a widely known issue of RL since its estimate of the maximum action value is larger than the ground truth, which results in an unstable learning process and suboptimal policy. This problem is …
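The overestimation effect described in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the paper's method: it assumes four actions whose true values are all zero and adds zero-mean Gaussian noise to each estimate. The function name `mean_max_estimate` and the parameters `noise_std`, `trials`, and `seed` are hypothetical, chosen only for this illustration. Although each individual estimate is unbiased, taking the maximum over the noisy estimates yields a value that is, on average, larger than the true maximum.

```python
import random
import statistics


def mean_max_estimate(true_values, noise_std=1.0, trials=10_000, seed=0):
    """Average, over many trials, of the maximum over noisy action-value estimates.

    Hypothetical helper for illustration only; the noise model (independent
    zero-mean Gaussian noise per action) is an assumption, not from the paper.
    """
    rng = random.Random(seed)
    maxima = []
    for _ in range(trials):
        # Each estimate is unbiased on its own: true value plus zero-mean noise.
        noisy = [q + rng.gauss(0.0, noise_std) for q in true_values]
        # The max operator picks whichever estimate happened to be inflated.
        maxima.append(max(noisy))
    return statistics.fmean(maxima)


# Assumed toy setting: four actions that are all equally good (true value 0).
true_q = [0.0, 0.0, 0.0, 0.0]
true_max = max(true_q)
estimated_max = mean_max_estimate(true_q)
bias = estimated_max - true_max

print(f"true max = {true_max:.3f}")
print(f"mean estimated max = {estimated_max:.3f}")
print(f"overestimation bias = {bias:.3f}")
```

In this toy setting the printed bias is positive even though no single estimate is biased, which is the sense in which the estimate of the maximum action value exceeds the ground truth. An RL agent that bootstraps from such a maximum can carry the inflated value forward into later updates.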