Sept. 22, 2022, 1:13 a.m. | Audrey Huang, Liu Leqi, Zachary Chase Lipton, Kamyar Azizzadenesheli

stat.ML updates on arXiv.org arxiv.org

Addressing such diverse ends as safety alignment with human preferences and the efficiency of learning, a growing line of reinforcement learning research focuses on risk functionals that depend on the entire distribution of returns. Recent work on *off-policy risk assessment* (OPRA) for contextual bandits introduced consistent estimators for the target policy's CDF of returns, along with finite-sample guarantees that extend to (and hold simultaneously over) all risk functionals. In this paper, we lift OPRA to Markov decision processes (MDPs), where …

