Offline RL Policies Should be Trained to be Adaptive. (arXiv:2207.02200v1 [cs.LG])

July 6, 2022, 1:11 a.m. | Dibya Ghosh, Anurag Ajay, Pulkit Agrawal, Sergey Levine

stat.ML updates on arXiv.org

Offline RL algorithms must account for the fact that the dataset they are
provided may leave many facets of the environment unknown. The most common way
to approach this challenge is to employ pessimistic or conservative methods,
which avoid behaviors that are too dissimilar from those in the training
dataset. However, relying exclusively on conservatism has drawbacks:
performance is sensitive to the exact degree of conservatism, and conservative
objectives can recover highly suboptimal policies. In this work, we propose
that …
