Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs. (arXiv:2203.09251v2 [cs.LG] UPDATED)
Web: http://arxiv.org/abs/2203.09251
June 20, 2022, 1:11 a.m. | Andrea Tirinzoni, Aymen Al-Marjani, Emilie Kaufmann
Source: cs.LG updates on arXiv.org
In probably approximately correct (PAC) reinforcement learning (RL), an agent
is required to identify an $\epsilon$-optimal policy with probability at least
$1-\delta$. While minimax-optimal algorithms exist for this problem, its
instance-dependent complexity remains elusive in episodic Markov decision
processes (MDPs). In this paper, we propose the first (nearly) matching upper
and lower bounds on the sample complexity of PAC RL in deterministic episodic
MDPs with finite state and action spaces. In particular, our bounds feature a
new notion of sub-optimality gap …
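
For readers unfamiliar with the PAC criterion described at the start of the abstract, it can be written as a single inequality. This is only a sketch in standard episodic-RL notation, which the excerpt above does not define: the initial state $s_1$, the value function $V_1^{\pi}$, the optimal policy $\pi^\star$, and the returned policy $\hat{\pi}$ are all assumed symbols and may differ from the notation used in the paper.

```latex
% Assumed notation (not taken from the abstract):
%   s_1            initial state of each episode
%   V_1^{\pi}(s_1) expected total reward of policy \pi over one episode
%   \pi^\star      an optimal policy
%   \hat{\pi}      the policy returned by the agent when it stops
\[
  \mathbb{P}\left( V_1^{\pi^\star}(s_1) - V_1^{\hat{\pi}}(s_1) \le \epsilon \right) \ge 1 - \delta
\]
```

Under this reading, the sample complexity is the number of episodes the agent collects before returning $\hat{\pi}$.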