all AI news
A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints
April 26, 2024, 4:41 a.m. | Bram De Cooman, Johan Suykens
cs.LG updates on arXiv.org arxiv.org
Abstract: Model-free reinforcement learning methods lack an inherent mechanism to impose behavioural constraints on the trained policies. While certain extensions exist, they remain limited to specific types of constraints, such as value constraints with additional reward signals or visitation density constraints. In this work we try to unify these existing techniques and bridge the gap with classical optimization and control theory, using a generic primal-dual framework for value-based and actor-critic reinforcement learning methods. The obtained dual …
abstract arxiv constraints cs.ai cs.lg cs.sy eess.sy extensions free perspective policies policy reinforcement reinforcement learning type types value while work
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Sr. Software Development Manager, AWS Neuron Machine Learning Distributed Training
@ Amazon.com | Cupertino, California, USA