Web: http://arxiv.org/abs/2206.07989

June 17, 2022, 1:10 a.m. | Jiafei Lyu, Xiu Li, Zongqing Lu

cs.LG updates on arXiv.org arxiv.org

The learned policy of model-free offline reinforcement learning (RL) methods
is often constrained to stay within the support of the dataset to avoid
potentially dangerous out-of-distribution actions or states, making it
challenging to handle out-of-support regions. Model-based RL methods offer a
richer dataset and benefit generalization by generating imaginary trajectories
with a trained forward or reverse dynamics model. However, the imagined
transitions may be inaccurate, thus degrading the performance of the
underlying offline RL method. In this paper, we propose to …

