all AI news
ANACONDA: An Improved Dynamic Regret Algorithm for Adaptive Non-Stationary Dueling Bandits. (arXiv:2210.14322v1 [cs.LG])
Oct. 27, 2022, 1:13 a.m. | Thomas Kleine Buening, Aadirupa Saha
stat.ML updates on arXiv.org arxiv.org
We study the problem of non-stationary dueling bandits and provide the first
adaptive dynamic regret algorithm for this problem. The only two existing
attempts in this line of work fall short across multiple dimensions, including
pessimistic measures of non-stationary complexity and non-adaptive parameter
tuning that requires knowledge of the number of preference changes. We develop
an elimination-based rescheduling algorithm to overcome these shortcomings and
show a near-optimal $\tilde{O}(\sqrt{S^{\texttt{CW}} T})$ dynamic regret bound,
where $S^{\texttt{CW}}$ is the number of times the …
More from arxiv.org / stat.ML updates on arXiv.org
Jobs in AI, ML, Big Data
Lead Developer (AI)
@ Cere Network | San Francisco, US
Research Engineer
@ Allora Labs | Remote
Ecosystem Manager
@ Allora Labs | Remote
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote