Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning
Feb. 19, 2024, 5:42 a.m. | Zihao Li, Boyi Liu, Zhuoran Yang, Zhaoran Wang, Mengdi Wang
cs.LG updates on arXiv.org
Abstract: We study the Constrained Convex Markov Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure, subject to a convex constraint. Designing algorithms for a constrained convex MDP faces several challenges, including (1) handling the large state space, (2) managing the exploration/exploitation tradeoff, and (3) solving the constrained optimization where the objective and the constraint are both nonlinear functions of the visitation measure. In this work, we present a …
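The problem the abstract describes can be written compactly. What follows is a sketch in notation assumed here, not taken from the paper: $\mu^{\pi}$ denotes the visitation measure induced by a policy $\pi$, $f$ the convex objective, $g$ the convex constraint function (the form $g \le 0$ is an assumption), and $\lambda \ge 0$ a Lagrange multiplier. The Lagrangian is the standard one for constrained problems and only illustrates what a primal-dual method would typically operate on; the paper's actual construction may differ.

```latex
% Assumed notation: \mu^{\pi}, f, g, \lambda are illustrative, not from the source.
\begin{align*}
\min_{\pi} \; & f(\mu^{\pi}) \quad \text{subject to} \quad g(\mu^{\pi}) \le 0, \\
L(\pi, \lambda) \; &= f(\mu^{\pi}) + \lambda\, g(\mu^{\pi}), \qquad \lambda \ge 0 .
\end{align*}
```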