Feb. 26, 2024, 5:42 a.m. | Julien Zhou (Thoth, STATIFY), Pierre Gaillard (Thoth), Thibaud Rahier (SODA, PREMEDICAL), Houssam Zenati (SODA, PREMEDICAL), Julyan Arbel (STATIFY)

cs.LG updates on arXiv.org arxiv.org

arXiv:2402.15171v1 Announce Type: new
Abstract: We address the problem of stochastic combinatorial semi-bandits, where a player can select from P subsets of a set containing d base items. Most existing algorithms (e.g. CUCB, ESCB, OLS-UCB) require prior knowledge on the reward distribution, like an upper bound on a sub-Gaussian proxy-variance, which is hard to estimate tightly. In this work, we design a variance-adaptive version of OLS-UCB, relying on an online estimation of the covariance structure. Estimating the coefficients of a …
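To make the setting concrete, here is a minimal sketch of a semi-bandit loop with a generic UCB-style index summed over the items of each subset. It is not the variance-adaptive OLS-UCB of the paper. It only illustrates why an algorithm of this family needs a proxy-variance bound up front. The function name `run_cucb_sketch`, the parameter `proxy_std`, the Gaussian rewards that are independent across items, and the bonus `proxy_std * sqrt(2 log t / n)` are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def run_cucb_sketch(means, actions, horizon, proxy_std, seed=0):
    """Play `horizon` rounds; return per-item pull counts and per-action play counts.

    `proxy_std` is the assumed sub-Gaussian proxy standard deviation. The
    learner must be given it in advance, which is the prior knowledge the
    abstract says is hard to estimate tightly.
    """
    rng = random.Random(seed)
    d = len(means)
    pulls = [0] * d          # number of observations per base item
    totals = [0.0] * d       # sum of observed rewards per base item
    plays = {a: 0 for a in actions}

    for t in range(1, horizon + 1):
        def ucb(i):
            # Unobserved items get an infinite index so they are tried first.
            if pulls[i] == 0:
                return float("inf")
            bonus = proxy_std * math.sqrt(2.0 * math.log(t) / pulls[i])
            return totals[i] / pulls[i] + bonus

        # Choose the subset whose summed item indices are largest.
        action = max(actions, key=lambda a: sum(ucb(i) for i in a))
        plays[action] += 1

        # Semi-bandit feedback: the reward of every item in the subset is observed.
        for i in action:
            totals[i] += rng.gauss(means[i], proxy_std)
            pulls[i] += 1

    return pulls, plays


if __name__ == "__main__":
    means = [0.9, 0.8, 0.3, 0.2]                 # hypothetical item means, d = 4
    actions = list(combinations(range(4), 2))    # P = 6 subsets of size 2
    pulls, plays = run_cucb_sketch(means, actions, horizon=2000, proxy_std=0.5)
    print("most played subset:", max(plays, key=plays.get))
```

If `proxy_std` is set larger than the true noise level, the bonus shrinks more slowly and the learner explores longer than necessary. That cost is what an online estimate of the covariance structure is meant to avoid.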

