June 29, 2022, 1:11 a.m. | Wenhao Zhan, Baihe Huang, Audrey Huang, Nan Jiang, Jason D. Lee

stat.ML updates on arXiv.org

Sample-efficiency guarantees for offline reinforcement learning (RL) often
rely on strong assumptions about both the function classes (e.g.,
Bellman-completeness) and the data coverage (e.g., all-policy concentrability).
Despite recent efforts to relax these assumptions, existing works have only
been able to relax one of the two factors, leaving the strong assumption on
the other intact. This raises an important open problem: can we achieve
sample-efficient offline RL with weak assumptions on both factors?
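
For readers unfamiliar with the two assumptions named above, the following sketch gives their standard formalizations; the notation (data distribution \mu, occupancy measure d^\pi, function class \mathcal{F}, Bellman operator \mathcal{T}) is assumed here and is not quoted from the truncated abstract.

```latex
% All-policy concentrability: the offline data distribution \mu must cover the
% state-action occupancy d^\pi of every candidate policy \pi with a uniformly
% bounded density ratio.
C_{\mathrm{all}} \;=\; \sup_{\pi}\, \sup_{s,a}\, \frac{d^{\pi}(s,a)}{\mu(s,a)} \;<\; \infty

% Bellman-completeness: the function class \mathcal{F} is closed under the
% Bellman (optimality) operator \mathcal{T}, i.e. applying \mathcal{T} never
% leaves the class.
\mathcal{T} f \in \mathcal{F} \quad \text{for all } f \in \mathcal{F},
\qquad
(\mathcal{T} f)(s,a) \;=\; r(s,a) \;+\; \gamma\, \mathbb{E}_{s' \sim P(\cdot \mid s,a)}
\Big[\max_{a'} f(s',a')\Big].
```

Both conditions are demanding: the first constrains the data-collection process over all policies rather than just the target policy, and the second constrains the representation class beyond mere realizability.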


In this paper we answer the question in …

Tags: arxiv, learning, lg, policy, reinforcement, reinforcement learning
