all AI news
Variance-Dependent Regret Bounds for Non-stationary Linear Bandits
March 19, 2024, 4:41 a.m. | Zhiyong Wang, Jize Xie, Yi Chen, John C. S. Lui, Dongruo Zhou
cs.LG updates on arXiv.org arxiv.org
Abstract: We investigate the non-stationary stochastic linear bandit problem where the reward distribution evolves each round. Existing algorithms characterize the non-stationarity by the total variation budget $B_K$, which is the summation of the change of the consecutive feature vectors of the linear bandits over $K$ rounds. However, such a quantity only measures the non-stationarity with respect to the expectation of the reward distribution, which makes existing algorithms sub-optimal under the general non-stationary distribution setting. In this …
abstract algorithms arxiv budget change cs.ai cs.lg distribution feature however linear stochastic total type variance variation vectors
More from arxiv.org / cs.LG updates on arXiv.org
The Perception-Robustness Tradeoff in Deterministic Image Restoration
2 days, 14 hours ago |
arxiv.org
Jobs in AI, ML, Big Data
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne