Oct. 26, 2022, 1:11 a.m. | Yi Zhao, Rinu Boney, Alexander Ilin, Juho Kannala, Joni Pajarinen

cs.LG updates on arXiv.org arxiv.org

Offline reinforcement learning, by learning from a fixed dataset, makes it
possible to learn agent behaviors without interacting with the environment.
However, depending on the quality of the offline dataset, such pre-trained
agents may have limited performance and may need further online fine-tuning
by interacting with the environment. During online fine-tuning, the
performance of the pre-trained agent may collapse quickly due to the sudden
distribution shift from offline to online data. While constraints enforced by
offline RL methods …
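The general recipe the abstract alludes to, keeping a policy close to the offline data via a behavior-cloning penalty while still maximizing value, can be sketched as below. This is a generic TD3+BC-style loss, not the paper's actual method (the excerpt is truncated); all names and the normalization constant are illustrative.

```python
import numpy as np

def bc_regularized_actor_loss(q_values, policy_actions, dataset_actions,
                              alpha=2.5):
    """Toy actor loss mixing Q-value maximization with behavior cloning.

    Illustrative only: maximize Q while penalizing deviation from the
    logged (offline) actions, which constrains the distribution shift
    when fine-tuning online.
    """
    # Scale the Q term so the BC penalty has a comparable magnitude.
    lam = alpha / (np.abs(q_values).mean() + 1e-8)
    rl_term = -lam * q_values.mean()  # negative because we minimize the loss
    bc_term = np.mean((policy_actions - dataset_actions) ** 2)
    return rl_term + bc_term
```

Relaxing or annealing the `bc_term` weight during online interaction is one common way to trade off safety against improvement as fresh data arrives.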

Tags: arxiv, behavior cloning, offline, online, reinforcement learning, regularization