Nov. 5, 2023, 6:42 a.m. | Jaafar Mhamed, Shangding Gu

cs.LG updates on arXiv.org

Incorporating safety is an essential prerequisite for broadening the
practical applications of reinforcement learning in real-world scenarios. To
tackle this challenge, Constrained Markov Decision Processes (CMDPs) are
leveraged, which introduce a distinct cost function representing safety
violations. In the CMDP setting, previous algorithms have employed Lagrangian
relaxation to convert constrained optimization problems into unconstrained
dual problems. However, these algorithms may inaccurately
predict unsafe behavior, resulting in instability while learning the Lagrange
multiplier. This study introduces a novel safe …
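
As a rough illustration of the Lagrangian relaxation described in the abstract (and not the method the paper itself introduces), the sketch below folds a cost constraint into a single objective and updates the multiplier by projected gradient ascent. The function names, the learning rate, and the cost estimates are all hypothetical and chosen only for this example.

```python
def lagrangian(reward_return, cost_return, lam, cost_limit):
    """Unconstrained objective: reward minus the penalized constraint violation."""
    return reward_return - lam * (cost_return - cost_limit)


def update_multiplier(lam, cost_return, cost_limit, lr=0.1):
    """One projected gradient ascent step on the multiplier, keeping lam >= 0."""
    return max(0.0, lam + lr * (cost_return - cost_limit))


# Hypothetical per-iteration cost estimates for a policy under training.
cost_estimates = [0.9, 0.7, 0.4, 0.2, 0.1]
cost_limit = 0.3  # assumed safety budget

lam = 0.0
history = []
for cost in cost_estimates:
    # The multiplier grows while the estimated cost exceeds the limit
    # and shrinks (never below zero) once the estimate falls under it.
    lam = update_multiplier(lam, cost, cost_limit)
    history.append(lam)
```

Because the multiplier update is driven directly by the estimated cost, an inaccurate estimate pushes the multiplier in the wrong direction, which is one way to picture the instability the abstract refers to.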
