Feb. 24, 2022, 2:11 a.m. | Chenjia Bai, Lingxiao Wang, Zhuoran Yang, Zhihong Deng, Animesh Garg, Peng Liu, Zhaoran Wang

cs.LG updates on arXiv.org

Offline Reinforcement Learning (RL) aims to learn policies from previously
collected datasets without exploring the environment. Directly applying
off-policy algorithms to offline RL usually fails due to the extrapolation
error caused by out-of-distribution (OOD) actions. Previous methods tackle
this problem by penalizing the Q-values of OOD actions or constraining the
trained policy to be close to the behavior policy. Nevertheless, such methods
typically prevent the generalization of value functions beyond the offline data
and also lack precise characterization of …
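The OOD-penalization idea mentioned above can be illustrated with a minimal tabular sketch: fit Q-values by Bellman backups on the fixed dataset only, and push down the Q-values of state-action pairs that never appear in the data so the greedy policy avoids them. The toy MDP, penalty form, and hyperparameters here are illustrative assumptions, not the paper's method.

```python
import numpy as np

# Hypothetical toy setup: 4 states, 2 actions (illustrative, not from the paper).
n_states, n_actions = 4, 2

# Fixed offline dataset of (s, a, r, s') transitions from a behavior policy.
dataset = [(0, 0, 1.0, 1), (1, 0, 0.0, 2), (2, 1, 1.0, 3), (3, 0, 0.0, 0)]

# Mark which (s, a) pairs appear in the data; everything else is OOD.
in_data = np.zeros((n_states, n_actions), dtype=bool)
for s, a, _, _ in dataset:
    in_data[s, a] = True

Q = np.zeros((n_states, n_actions))
gamma, lr, penalty = 0.9, 0.1, 10.0  # penalty strength is an assumed hyperparameter

for _ in range(500):
    for s, a, r, s2 in dataset:
        # Standard Bellman backup, using only in-distribution transitions.
        target = r + gamma * Q[s2].max()
        Q[s, a] += lr * (target - Q[s, a])
    # Penalize Q-values of OOD actions, bounded below at -penalty.
    Q[~in_data] = np.maximum(Q[~in_data] - lr * penalty, -penalty)

policy = Q.argmax(axis=1)  # greedy policy stays on in-distribution actions
```

The clamp at `-penalty` keeps OOD values finite; without some penalty, `Q[s2].max()` in the backup could bootstrap from an overestimated OOD action, which is exactly the extrapolation error the abstract describes.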

