Web: http://arxiv.org/abs/2205.01970

May 5, 2022, 1:10 a.m. | Yueyang Liu, Benjamin Van Roy, Kuang Xu

stat.ML updates on arXiv.org

We propose predictive sampling as an approach to selecting actions that
balance exploration and exploitation in nonstationary bandit
environments. When specialized to stationary environments, predictive sampling
is equivalent to Thompson sampling. However, predictive sampling is effective
across a range of nonstationary environments in which Thompson sampling
suffers. We establish a general information-theoretic bound on the Bayesian
regret of predictive sampling. We then specialize this bound to study a
modulated Bernoulli bandit environment. Our analysis highlights a key advantage
of …
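
Below is a minimal sketch of the stationary special case mentioned in the abstract: Thompson sampling for a Bernoulli bandit with Beta posteriors, which the abstract states predictive sampling reduces to in stationary environments. This is not the paper's predictive sampling algorithm; the arm count, horizon, and true arm means are illustrative assumptions.

```python
# Sketch: Thompson sampling on a stationary Bernoulli bandit
# (the special case predictive sampling is equivalent to, per the abstract).
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.3, 0.5, 0.7])   # assumed arm means, unknown to the agent
n_arms, horizon = len(true_means), 1000

# Beta(alpha, beta) posterior over each arm's success probability
alpha = np.ones(n_arms)
beta = np.ones(n_arms)

for t in range(horizon):
    theta = rng.beta(alpha, beta)         # sample one plausible mean per arm
    arm = int(np.argmax(theta))           # act greedily with respect to the sample
    reward = rng.binomial(1, true_means[arm])
    alpha[arm] += reward                  # posterior update on observed reward
    beta[arm] += 1 - reward

print("posterior means:", alpha / (alpha + beta))
```

In a nonstationary environment the same Beta-posterior update would be a poor fit, which is the regime where the abstract reports Thompson sampling suffers and predictive sampling is effective.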

arxiv learning predictive sampling
