Sept. 14, 2022, 1:11 a.m. | Varsha Pendyala

cs.LG updates on arXiv.org

In this work I study the problem of adversarial perturbations to rewards in
a multi-armed bandit (MAB) setting. Specifically, I focus on an adversarial
attack on a UCB-type best-arm identification policy applied to a stochastic
MAB. The UCB attack presented in [1] causes a target arm K to be pulled very
often. I used the attack model of [1] to derive the sample complexity required
for selecting target arm K as the best arm. I have proved that the stopping …

