Dynamic Regret of Online Markov Decision Processes (arXiv:2208.12483v1 [cs.LG])

Aug. 29, 2022, 1:12 a.m. | Peng Zhao, Long-Fei Li, Zhi-Hua Zhou

stat.ML updates on arXiv.org

We investigate online Markov Decision Processes (MDPs) with adversarially
changing loss functions and known transitions. We choose dynamic regret as the
performance measure, defined as the performance difference between the learner
and any sequence of feasible changing policies. This measure is strictly
stronger than the standard static regret, which benchmarks the learner
against a single fixed comparator policy. We consider three foundational models
of online MDPs: episodic loop-free Stochastic Shortest Path (SSP),
episodic SSP, and infinite-horizon MDPs. For these …
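
The gap between static and dynamic regret can be illustrated with a toy calculation. The sketch below is not the paper's algorithm or setting: it assumes a small finite set of policies whose per-episode expected losses are already known as plain numbers (in an MDP with known transitions these would be computed from the loss function and the policy's occupancy measure). All function names and the example data are hypothetical.

```python
from typing import List, Sequence

# losses[t][k] = expected loss of policy k in episode t (assumed given).
LossTable = Sequence[Sequence[float]]


def cumulative_loss(losses: LossTable, choices: Sequence[int]) -> float:
    """Total loss of playing policy choices[t] in episode t."""
    return sum(losses[t][k] for t, k in enumerate(choices))


def static_regret(losses: LossTable, learner: Sequence[int]) -> float:
    """Learner's loss minus the loss of the best single fixed policy."""
    n_policies = len(losses[0])
    best_fixed = min(
        cumulative_loss(losses, [k] * len(losses)) for k in range(n_policies)
    )
    return cumulative_loss(losses, learner) - best_fixed


def dynamic_regret(
    losses: LossTable, learner: Sequence[int], comparators: Sequence[int]
) -> float:
    """Learner's loss minus the loss of an arbitrary comparator sequence."""
    return cumulative_loss(losses, learner) - cumulative_loss(losses, comparators)


def worst_case_dynamic_regret(losses: LossTable, learner: Sequence[int]) -> float:
    """Dynamic regret against the per-episode best policy (hardest comparator)."""
    per_episode_best: List[int] = [
        min(range(len(row)), key=row.__getitem__) for row in losses
    ]
    return dynamic_regret(losses, learner, per_episode_best)


# Toy adversarial losses: the better policy alternates every episode.
losses = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
learner = [0, 0, 0, 0]  # a learner that always plays policy 0

print(static_regret(losses, learner))              # regret vs. best fixed policy
print(worst_case_dynamic_regret(losses, learner))  # regret vs. changing policies
```

In this example no fixed policy beats the learner, so the static regret is zero, while a comparator that switches policies every episode incurs no loss at all, so the dynamic regret is positive. Choosing a constant comparator sequence recovers static regret as a special case, which is the sense in which the dynamic measure is the stronger one.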


