Residual Bootstrap Exploration for Stochastic Linear Bandit. (arXiv:2202.11474v2 [stat.ML] UPDATED)
Web: http://arxiv.org/abs/2202.11474
June 20, 2022, 1:12 a.m. | Shuang Wu, Chi-Hua Wang, Yuantong Li, Guang Cheng
stat.ML updates on arXiv.org
We propose a new bootstrap-based online algorithm for stochastic linear
bandit problems. The key idea is residual bootstrap exploration, in
which the agent estimates the next-step reward by resampling the residuals of
the mean reward estimate. Our algorithm, residual bootstrap exploration for
stochastic linear bandit (LinReBoot), estimates the linear reward from
its resampling distribution and pulls the arm with the highest reward
estimate. In particular, we contribute a theoretical framework to demystify
residual bootstrap-based exploration mechanisms in stochastic …
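
The idea of resampling residuals to drive exploration can be illustrated with a minimal sketch. This is not the authors' LinReBoot procedure, whose details are not given in the abstract; it uses the classical residual bootstrap (resampling residuals with replacement) on top of a ridge regression fit. The function names, the ridge parameter `lam`, the arm vectors, the noise level, and `theta_true` are all assumptions made for illustration.

```python
import numpy as np


def residual_bootstrap_choice(X, y, arms, rng, lam=1.0):
    """Pick an arm index from one residual-bootstrap draw (illustrative only)."""
    d = arms.shape[1]
    if len(y) == 0:
        # No history yet: choose uniformly at random.
        return int(rng.integers(len(arms)))
    # Ridge estimate of the reward parameter from the observed history.
    V = X.T @ X + lam * np.eye(d)
    theta_hat = np.linalg.solve(V, X.T @ y)
    fitted = X @ theta_hat
    residuals = y - fitted
    # Classical residual bootstrap: resample residuals with replacement
    # and refit on the perturbed pseudo-rewards.
    resampled = rng.choice(residuals, size=len(y), replace=True)
    theta_boot = np.linalg.solve(V, X.T @ (fitted + resampled))
    # Pull the arm with the highest bootstrapped reward estimate.
    return int(np.argmax(arms @ theta_boot))


def run_simulation(rounds=200, seed=0):
    """Simulate a small linear bandit; all problem constants are hypothetical."""
    rng = np.random.default_rng(seed)
    arms = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])  # assumed arm features
    theta_true = np.array([0.2, 0.9])                      # assumed true parameter
    X = np.empty((0, arms.shape[1]))
    y = np.empty(0)
    pulls = np.zeros(len(arms), dtype=int)
    for _ in range(rounds):
        a = residual_bootstrap_choice(X, y, arms, rng)
        reward = arms[a] @ theta_true + rng.normal(scale=0.1)
        X = np.vstack([X, arms[a]])
        y = np.append(y, reward)
        pulls[a] += 1
    return pulls


if __name__ == "__main__":
    print(run_simulation())
```

The randomness of the refit estimate `theta_boot` is what produces exploration in this sketch: arms with little data yield noisier bootstrapped estimates and so are occasionally selected even when their point estimate is not the largest.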