Sept. 23, 2022, 1:11 a.m. | Yan Li, Tuo Zhao, Guanghui Lan

cs.LG updates on arXiv.org

We consider the problem of solving robust Markov decision processes (MDPs),
which involve a set of discounted, finite-state, finite-action MDPs with
uncertain transition kernels. The goal of planning is to find a robust policy
that optimizes the worst-case values against the transition uncertainties,
and thus encompasses standard MDP planning as a special case. For
$(\mathbf{s},\mathbf{a})$-rectangular uncertainty sets, we develop a
policy-based first-order method, namely the robust policy mirror descent
(RPMD), and establish an $\mathcal{O}(\log(1/\epsilon))$ and
$\mathcal{O}(1/\epsilon)$ …

