Nov. 24, 2022, 7:14 a.m. | Qinghua Liu, Praneeth Netrapalli, Csaba Szepesvári, Chi Jin

stat.ML updates on arXiv.org arxiv.org

This paper introduces a simple efficient learning algorithms for general
sequential decision making. The algorithm combines Optimism for exploration
with Maximum Likelihood Estimation for model estimation, which is thus named
OMLE. We prove that OMLE learns the near-optimal policies of an enormously rich
class of sequential decision making problems in a polynomial number of samples.
This rich class includes not only a majority of known tractable model-based
Reinforcement Learning (RL) problems (such as tabular MDPs, factored MDPs, low
witness rank …

algorithm arxiv decision decision making making mle observable

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Data Engineer

@ Parker | New York City

Sr. Data Analyst | Home Solutions

@ Three Ships | Raleigh or Charlotte, NC