Know Your Boundaries: The Necessity of Explicit Behavioral Cloning in Offline RL. (arXiv:2206.00695v1 [cs.LG])
June 3, 2022, 1:10 a.m. | Wonjoon Goo, Scott Niekum
We introduce an offline reinforcement learning (RL) algorithm that explicitly
clones a behavior policy to constrain value learning. In offline RL, it is
often important to prevent a policy from selecting unobserved actions, since
the consequence of these actions cannot be presumed without additional
information about the environment. One straightforward way to implement such a
constraint is to explicitly model a given data distribution via behavior
cloning and directly force a policy not to select uncertain actions. However,
many offline …
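
The constraint described above (model the data distribution with a cloned behavior policy, then keep the learned policy away from actions that model considers unsupported) can be illustrated with a small tabular sketch. This is not the authors' algorithm, which the truncated abstract does not specify. It assumes discrete states and actions, estimates the behavior policy from empirical counts, and uses a hypothetical support threshold; the function names `fit_behavior_policy` and `constrained_greedy_action` are invented for illustration.

```python
from collections import Counter, defaultdict


def fit_behavior_policy(dataset):
    """Clone the behavior policy as empirical frequencies pi_b(a | s).

    `dataset` is an iterable of (state, action) pairs from the offline data.
    Assumption: states and actions are discrete and hashable.
    """
    counts = defaultdict(Counter)
    for state, action in dataset:
        counts[state][action] += 1
    return {
        state: {a: n / sum(c.values()) for a, n in c.items()}
        for state, c in counts.items()
    }


def constrained_greedy_action(q_values, behavior_probs, threshold=0.05):
    """Pick the highest-value action among those the cloned policy supports.

    `threshold` is a hypothetical cutoff: actions whose estimated behavior
    probability falls below it are treated as unobserved and never selected.
    """
    supported = [a for a, p in behavior_probs.items() if p >= threshold]
    if not supported:
        raise ValueError("no action meets the support threshold for this state")
    return max(supported, key=lambda a: q_values[a])


# Toy offline dataset: action 2 is never taken in state 0.
dataset = [(0, 0), (0, 0), (0, 1), (1, 2), (1, 2)]
pi_b = fit_behavior_policy(dataset)

# Made-up value estimates; state 0 assigns a high value to the unobserved action 2.
q = {0: {0: 1.0, 1: 0.5, 2: 9.0}, 1: {0: 3.0, 1: 2.0, 2: 1.0}}

a0 = constrained_greedy_action(q[0], pi_b[0])
a1 = constrained_greedy_action(q[1], pi_b[1])
```

In state 0 an unconstrained greedy policy would pick action 2 because of its inflated value estimate, even though the data never shows that action there. The mask restricts the choice to actions 0 and 1, which is the kind of boundary the abstract argues for.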