Web: http://arxiv.org/abs/2206.11889

June 24, 2022, 1:10 a.m. | Arnob Ghosh, Xingyu Zhou, Ness Shroff

cs.LG updates on arXiv.org

We study the constrained reinforcement learning problem, in which an agent
aims to maximize the expected cumulative reward subject to a constraint on the
expected total value of a utility function. In contrast to existing model-based
approaches and model-free methods that rely on a 'simulator', we aim to
develop the first model-free, simulator-free algorithm that achieves
sublinear regret and sublinear constraint violation even in large-scale
systems. To this end, we consider episodic constrained Markov decision
processes with linear …
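
The constrained objective described above can be written compactly. What follows is only a sketch in standard constrained-MDP notation: the symbols $V_r^{\pi}$, $V_g^{\pi}$, $b$, $K$, $\pi_k$, $\pi^*$, and $s_1$, as well as the direction of the inequality, are assumptions introduced here for illustration and are not taken from the abstract.

```latex
% Assumed notation (not from the abstract):
%   V_r^{\pi}(s_1), V_g^{\pi}(s_1): expected cumulative reward / utility of policy \pi
%                                   starting from the initial state s_1
%   b: constraint threshold, K: number of episodes,
%   \pi_k: policy used in episode k, \pi^*: best policy satisfying the constraint
\begin{align*}
  &\max_{\pi}\; V_r^{\pi}(s_1)
     \quad \text{subject to} \quad V_g^{\pi}(s_1) \ge b, \\
  &\mathrm{Regret}(K)
     = \sum_{k=1}^{K} \bigl( V_r^{\pi^*}(s_1) - V_r^{\pi_k}(s_1) \bigr), \\
  &\mathrm{Violation}(K)
     = \sum_{k=1}^{K} \bigl( b - V_g^{\pi_k}(s_1) \bigr).
\end{align*}
```

Under this reading, "sublinear" means that both quantities grow as $o(K)$, so the average regret and the average constraint violation per episode vanish as $K$ grows.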
