Web: http://arxiv.org/abs/2112.10751

May 12, 2022, 1:11 a.m. | Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine

cs.LG updates on arXiv.org

Recent work has shown that supervised learning alone, without temporal
difference (TD) learning, can be remarkably effective for offline RL. When does
this hold true, and which algorithmic components are necessary? Through
extensive experiments, we boil supervised learning for offline RL down to its
essential elements. In every environment suite we consider, simply maximizing
likelihood with a two-layer feedforward MLP is competitive with
state-of-the-art results of substantially more complex methods based on TD
learning or sequence modeling with Transformers. Carefully …
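
As a rough illustration of what "simply maximizing likelihood with a two-layer feedforward MLP" can look like, here is a minimal NumPy sketch, not the authors' implementation. It makes several assumptions that the abstract does not state: "two-layer" is read as two weight matrices with one ReLU hidden layer, actions are discrete, the policy input `x` stands in for whatever the policy is conditioned on, and the dataset is synthetic. All function names and hyperparameters (`train`, `hidden_dim`, `lr`, `steps`) are hypothetical.

```python
import numpy as np

def init_params(rng, in_dim, hidden_dim, n_actions):
    # Two weight matrices: input -> hidden, hidden -> action logits.
    return {
        "W1": rng.normal(0.0, 0.1, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, (hidden_dim, n_actions)),
        "b2": np.zeros(n_actions),
    }

def forward(params, x):
    # ReLU hidden layer followed by a numerically stable log-softmax.
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])
    logits = h @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return h, log_probs

def nll_and_grads(params, x, actions):
    # Negative log-likelihood of the dataset actions and its gradients.
    n = x.shape[0]
    h, log_probs = forward(params, x)
    nll = -log_probs[np.arange(n), actions].mean()
    dlogits = np.exp(log_probs)
    dlogits[np.arange(n), actions] -= 1.0
    dlogits /= n
    grads = {"W2": h.T @ dlogits, "b2": dlogits.sum(axis=0)}
    dh = dlogits @ params["W2"].T
    dh[h <= 0.0] = 0.0  # backpropagate through ReLU
    grads["W1"] = x.T @ dh
    grads["b1"] = dh.sum(axis=0)
    return nll, grads

def train(x, actions, n_actions, hidden_dim=32, lr=0.5, steps=500, seed=0):
    # Full-batch gradient descent on the NLL; no TD targets are involved.
    rng = np.random.default_rng(seed)
    params = init_params(rng, x.shape[1], hidden_dim, n_actions)
    losses = []
    for _ in range(steps):
        nll, grads = nll_and_grads(params, x, actions)
        losses.append(nll)
        for key in params:
            params[key] -= lr * grads[key]
    return params, losses

# Synthetic "offline dataset" (an assumption for illustration only):
# the logged action depends on the sign of the first state feature.
data_rng = np.random.default_rng(1)
states = data_rng.normal(size=(256, 4))
actions = (states[:, 0] > 0).astype(int)

params, losses = train(states, actions, n_actions=2)
_, log_probs = forward(params, states)
accuracy = float((log_probs.argmax(axis=1) == actions).mean())
```

The only training signal here is the log-likelihood of the logged actions, which is the sense in which the approach is purely supervised. A practical setup would swap the hand-written gradients for an autodiff library and use minibatches.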

