July 13, 2022, 1:11 a.m. | Ilya Osadchiy, Kfir Y. Levy, Ron Meir

cs.LG updates on arXiv.org

We study meta-learning for adversarial multi-armed bandits. We consider the
online-within-online setup, in which a player (learner) encounters a sequence
of multi-armed bandit episodes. The player's performance is measured as regret
against the best arm in each episode, according to the losses generated by an
adversary. The difficulty of the problem depends on the empirical distribution
of the per-episode best arm chosen by the adversary. We present an algorithm
that can leverage the non-uniformity in this empirical distribution, and derive …
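To make the setup concrete, below is a minimal sketch of the online-within-online protocol: a learner plays a sequence of bandit episodes and is measured by per-episode regret against that episode's best arm. Exp3 is used here only as a generic stand-in bandit learner, not the authors' algorithm, and the adversary, episode lengths, and the skewed best-arm distribution are hypothetical placeholders meant to illustrate the kind of non-uniformity the paper aims to exploit.

```python
# Illustrative sketch of the online-within-online bandit setup.
# Exp3 is a stand-in learner, NOT the paper's meta-learning algorithm.
import numpy as np

def exp3_episode(losses, eta, rng):
    """Run Exp3 on one episode. `losses` has shape (T, K) with entries in [0, 1].
    Returns the learner's total loss and its regret against the best
    fixed arm of this episode (the per-episode benchmark)."""
    T, K = losses.shape
    weights = np.ones(K)
    total_loss = 0.0
    for t in range(T):
        probs = weights / weights.sum()
        arm = rng.choice(K, p=probs)
        loss = losses[t, arm]
        total_loss += loss
        # Importance-weighted loss estimate: only the pulled arm is updated.
        est = np.zeros(K)
        est[arm] = loss / probs[arm]
        weights *= np.exp(-eta * est)
    best_arm_loss = losses.sum(axis=0).min()  # best arm in hindsight for this episode
    return total_loss, total_loss - best_arm_loss

# Hypothetical adversary: the per-episode best arm is drawn from a skewed
# empirical distribution (arm 0 is best half the time), which is the kind of
# non-uniform structure a meta-learner could leverage across episodes.
K, T, episodes = 10, 500, 20
rng = np.random.default_rng(1)
arm_prior = np.array([0.5] + [0.5 / (K - 1)] * (K - 1))
total_regret = 0.0
for _ in range(episodes):
    good_arm = rng.choice(K, p=arm_prior)
    losses = rng.uniform(0.4, 1.0, size=(T, K))
    losses[:, good_arm] = rng.uniform(0.0, 0.3, size=T)  # one clearly better arm
    eta = np.sqrt(2 * np.log(K) / (T * K))  # standard Exp3 learning rate
    _, regret = exp3_episode(losses, eta, rng)
    total_regret += regret
print(f"Total per-episode regret summed over {episodes} episodes: {total_regret:.1f}")
```

In this sketch each episode restarts the learner from scratch; a meta-learning approach would instead carry information about the empirical best-arm distribution across episodes, for example by biasing the initial arm weights.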

Tags: arxiv, learning, lg, meta, meta-learning, multi-armed bandits
