April 26, 2024, 4:41 a.m. | Bram De Cooman, Johan Suykens

cs.LG updates on arXiv.org arxiv.org

arXiv:2404.16468v1 Announce Type: new
Abstract: Model-free reinforcement learning methods lack an inherent mechanism to impose behavioural constraints on the trained policies. While certain extensions exist, they remain limited to specific types of constraints, such as value constraints with additional reward signals or visitation density constraints. In this work we try to unify these existing techniques and bridge the gap with classical optimization and control theory, using a generic primal-dual framework for value-based and actor-critic reinforcement learning methods. The obtained dual …

abstract arxiv constraints cs.ai cs.lg cs.sy eess.sy extensions free perspective policies policy reinforcement reinforcement learning type types value while work

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Sr. Software Development Manager, AWS Neuron Machine Learning Distributed Training

@ Amazon.com | Cupertino, California, USA