Safe Reinforcement Learning for Constrained Markov Decision Processes with Stochastic Stopping Time
March 26, 2024, 4:41 a.m. | Abhijit Mazumdar, Rafal Wisniewski, Manuela L. Bujorianu
Source: cs.LG updates on arXiv.org
Abstract: In this paper, we present an online reinforcement learning algorithm for constrained Markov decision processes with a safety constraint. Despite attention from the scientific community, the problem of learning an optimal policy under a stochastic stopping time, without violating safety constraints during the learning phase, has yet to be addressed. To this end, we propose an algorithm based on linear programming that does not require a process model. We show that the learned policy is …
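The abstract does not state the linear program itself. As a rough illustration only, the block below writes out the textbook occupation-measure LP for a constrained MDP that runs until a stochastic stopping time $\tau$, taken here to be the first exit from a set of transient states $S'$. It assumes a known transition kernel $P$, which the paper's algorithm explicitly does not require, so it is background on the problem class and not the authors' method. All symbols ($\rho$, $\mu$, $c$, $U$, $p$) are notation assumed for this sketch, not taken from the source.

```latex
% Assumed notation (not from the source):
%   S'        transient (non-terminal) states, A actions
%   \mu       initial state distribution on S'
%   c(s,a)    per-step cost
%   U         unsafe terminal set, disjoint from S'
%   p         tolerated probability of ending in U
%   \rho(s,a) = E[ \sum_{t < \tau} 1{ s_t = s, a_t = a } ]  (occupation measure)
\begin{align*}
\min_{\rho \ge 0} \quad
  & \sum_{s \in S'} \sum_{a \in A} c(s,a)\,\rho(s,a) \\
\text{s.t.} \quad
  & \sum_{a \in A} \rho(s',a)
    - \sum_{s \in S'} \sum_{a \in A} P(s' \mid s,a)\,\rho(s,a)
    = \mu(s') \qquad \forall s' \in S', \\
  & \sum_{s \in S'} \sum_{a \in A} P(U \mid s,a)\,\rho(s,a) \le p .
\end{align*}
```

Under these assumptions, the first constraint is the flow balance that makes $\rho$ a valid occupation measure, and the left-hand side of the second is the probability that the process ends in $U$ at time $\tau$. A stationary policy can be read off as $\pi(a \mid s) = \rho(s,a) / \sum_{a'} \rho(s,a')$ wherever the denominator is positive.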