Web: http://arxiv.org/abs/2202.11474

June 20, 2022, 1:12 a.m. | Shuang Wu, Chi-Hua Wang, Yuantong Li, Guang Cheng

stat.ML updates on arXiv.org

We propose a new bootstrap-based online algorithm for stochastic linear
bandit problems. The key idea is residual bootstrap exploration, in
which the agent estimates the next-step reward by resampling the residuals of
the mean reward estimate. Our algorithm, residual bootstrap exploration for
stochastic linear bandits (LinReBoot), estimates the linear reward from
its resampling distribution and pulls the arm with the highest reward
estimate. In particular, we contribute a theoretical framework to demystify
residual bootstrap-based exploration mechanisms in stochastic …
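
The exploration step described above can be illustrated with a minimal sketch. It is not the paper's exact LinReBoot procedure: it assumes a ridge-regression fit and plain resampling of residuals with replacement, and the names `choose_arm`, `arms`, `X`, `y`, and `lam` are hypothetical, introduced only for illustration.

```python
import numpy as np


def choose_arm(arms, X, y, rng, lam=1.0):
    """Pick an arm from a residual-bootstrap-perturbed ridge estimate.

    arms: (K, d) feature vectors of the candidate arms.
    X:    (n, d) features of the arms pulled so far.
    y:    (n,)   rewards observed so far.
    lam:  ridge penalty (an assumption of this sketch).
    """
    n_arms, d = arms.shape
    if len(y) == 0:
        # No history yet: fall back to a uniformly random arm.
        return int(rng.integers(n_arms))

    # Ridge estimate of the reward parameter from the history.
    A = X.T @ X + lam * np.eye(d)
    theta_hat = np.linalg.solve(A, X.T @ y)

    # Residuals of the mean reward estimate.
    fitted = X @ theta_hat
    residuals = y - fitted

    # Resample the residuals with replacement and refit on the
    # perturbed rewards; the randomness in this refit drives exploration.
    boot_residuals = rng.choice(residuals, size=len(residuals), replace=True)
    theta_boot = np.linalg.solve(A, X.T @ (fitted + boot_residuals))

    # Pull the arm with the highest bootstrapped reward estimate.
    return int(np.argmax(arms @ theta_boot))
```

In a bandit loop, one would call `choose_arm` at each round with a generator such as `np.random.default_rng(0)`, observe the reward of the returned arm, and append that arm's features and reward to `X` and `y` before the next round.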
