June 24, 2022, 1:10 a.m. | Arnob Ghosh, Xingyu Zhou, Ness Shroff

cs.LG updates on arXiv.org

We study the constrained reinforcement learning problem, in which an agent
aims to maximize the expected cumulative reward subject to a constraint on the
expected total value of a utility function. In contrast to existing model-based
approaches or model-free methods accompanied by a 'simulator', we aim to
develop the first model-free, simulator-free algorithm that achieves
sublinear regret and sublinear constraint violation even in large-scale
systems. To this end, we consider episodic constrained Markov decision
processes with linear …
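
As a rough sketch of the problem the abstract describes, the objective and the two performance measures can be written as follows. The notation is assumed for illustration and is not taken from the listing: $V_r^{\pi}$ and $V_g^{\pi}$ denote the expected cumulative reward and utility of policy $\pi$ from the initial state $x_1$, $b$ is the constraint threshold (the direction of the inequality is an assumption), $\pi^*$ is an optimal feasible policy, and $\pi_k$ is the policy used in episode $k$ of $K$.

```latex
% Assumed notation: V_r = reward value, V_g = utility value, b = threshold.
\begin{align*}
\max_{\pi}\; & V_r^{\pi}(x_1) \quad \text{subject to} \quad V_g^{\pi}(x_1) \ge b, \\
\mathrm{Regret}(K) &= \sum_{k=1}^{K} \bigl( V_r^{\pi^*}(x_1) - V_r^{\pi_k}(x_1) \bigr), \\
\mathrm{Violation}(K) &= \sum_{k=1}^{K} \bigl( b - V_g^{\pi_k}(x_1) \bigr).
\end{align*}
```

Under this reading, "sublinear" means both quantities grow as $o(K)$, so the per-episode regret and the per-episode constraint violation vanish as the number of episodes grows.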

