Feb. 13, 2024, 5:45 a.m. | Prashansa Panda, Shalabh Bhatnagar

cs.LG updates on arXiv.org

Actor-critic methods have been widely applied to reinforcement learning tasks, especially when the state-action space is large. In this paper, we consider actor-critic and natural actor-critic algorithms with function approximation for constrained Markov decision processes (C-MDP) involving inequality constraints and carry out a non-asymptotic analysis of both algorithms in a non-i.i.d. (Markovian) setting. We consider the long-run average cost criterion where both the objective and the constraint functions are suitable policy-dependent …
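
For orientation, a constrained problem of the kind described above can be sketched in generic notation. The symbols below are illustrative assumptions and are not taken from the paper: $\pi_\theta$ is a policy with parameter $\theta$, $c$ is a single-stage cost, $g_1, \dots, g_m$ are constraint costs, and $\alpha_1, \dots, \alpha_m$ are prescribed thresholds.

```latex
% Generic long-run average cost C-MDP (assumed notation, not the paper's).
% Objective: long-run average of the single-stage cost c under policy \pi_\theta.
\[
  J(\theta) \;=\; \lim_{N \to \infty} \frac{1}{N}\,
  \mathbb{E}\!\left[\, \sum_{n=0}^{N-1} c(s_n, a_n) \;\middle|\; \pi_\theta \right]
\]
% Constraint functions: long-run averages of the constraint costs g_i.
\[
  G_i(\theta) \;=\; \lim_{N \to \infty} \frac{1}{N}\,
  \mathbb{E}\!\left[\, \sum_{n=0}^{N-1} g_i(s_n, a_n) \;\middle|\; \pi_\theta \right],
  \qquad i = 1, \dots, m
\]
% Constrained problem with inequality constraints.
\[
  \min_{\theta} \; J(\theta)
  \quad \text{subject to} \quad
  G_i(\theta) \le \alpha_i, \qquad i = 1, \dots, m
\]
```

Here the state-action pairs $(s_n, a_n)$ come from a single Markovian trajectory under $\pi_\theta$, which is why the samples are not i.i.d.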


