Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-free RL. (arXiv:2206.14057v1 [cs.LG])
June 29, 2022, 1:10 a.m. | Ruiquan Huang, Jing Yang, Yingbin Liang
cs.LG updates on arXiv.org arxiv.org
While the primary goal of the exploration phase in reward-free reinforcement
learning (RF-RL) is to reduce the uncertainty in the estimated model with
the minimum number of trajectories, in practice, the agent often needs to abide by
certain safety constraints at the same time. It remains unclear how such a safe
exploration requirement would affect the corresponding sample complexity to
achieve the desired optimality of the obtained policy in planning. In this
work, we make a first attempt to answer this question. …