Web: http://arxiv.org/abs/2201.10070

Jan. 26, 2022, 2:11 a.m. | Yihuan Mao, Chao Wang, Bin Wang, Chongjie Zhang

cs.LG updates on arXiv.org

With the success of offline reinforcement learning (RL), offline-trained RL
policies have the potential to be further improved when deployed online. A
smooth transfer of the policy matters for safe real-world deployment. In
addition, fast adaptation of the policy plays a vital role in practical online
performance improvement. To tackle these challenges, we propose a simple yet
efficient algorithm, Model-based Offline-to-Online Reinforcement learning
(MOORe), which employs a prioritized sampling scheme that can dynamically
adjust the offline and online data for …
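The abstract is truncated, so the exact sampling rule is not given here. As an illustration only, a prioritized offline/online sampling scheme of the general kind the abstract describes could be sketched as below; the class name, the capacity parameter, and the specific rule (sampling probability shifting toward online data as it accumulates) are all assumptions, not the paper's actual method.

```python
import random

class MixedReplayBuffer:
    """Hypothetical sketch: a replay buffer that dynamically reweights
    sampling between a fixed offline dataset and a growing online buffer.
    Early in online fine-tuning most samples are offline; the probability
    of drawing online transitions rises as more of them are collected.
    """

    def __init__(self, offline_data, online_capacity=10000):
        self.offline = list(offline_data)       # fixed offline dataset
        self.online = []                        # FIFO online buffer
        self.online_capacity = online_capacity

    def add_online(self, transition):
        # Evict the oldest online transition once capacity is reached.
        if len(self.online) >= self.online_capacity:
            self.online.pop(0)
        self.online.append(transition)

    def online_fraction(self):
        # Assumed priority rule: weight online data by its share of all data.
        total = len(self.offline) + len(self.online)
        return len(self.online) / total if total else 0.0

    def sample(self, batch_size, rng=random):
        # Draw each batch element from online or offline data according
        # to the current (dynamically adjusted) online fraction.
        batch = []
        p_online = self.online_fraction()
        for _ in range(batch_size):
            if self.online and rng.random() < p_online:
                batch.append(rng.choice(self.online))
            else:
                batch.append(rng.choice(self.offline))
        return batch
```

Before any online interaction the buffer samples purely from the offline dataset; as online transitions are added, `online_fraction()` grows and batches gradually mix in fresh data, which matches the smooth-transfer motivation stated above.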

