Web: http://arxiv.org/abs/2206.06847

June 20, 2022, 1:11 a.m. | Yanwen Li, Siyang Gao

cs.LG updates on arXiv.org

The knowledge gradient (KG) algorithm is a popular and effective algorithm
for the best arm identification (BAI) problem. Due to the complex calculation
of KG, theoretical analysis of this algorithm is difficult, and existing
results mostly concern its asymptotic performance, e.g., consistency and
asymptotic sample allocation. In this research, we present new theoretical
results about the finite-time performance of the KG algorithm. Under
independent and normally distributed rewards, we derive lower bounds and upper
bounds for the …
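
For readers unfamiliar with KG, the sampling rule the abstract refers to can be sketched as follows. This is a minimal illustration, not code from the paper: it assumes the standard closed-form KG factor for independent normal beliefs with known sampling variances. The function names (kg_factors, kg_select, update) are hypothetical.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kg_factors(means, variances, noise_vars):
    """KG factor of each arm (assumes at least two arms and positive variances).

    means, variances: current posterior mean and variance of each arm.
    noise_vars: known sampling variance of each arm (an assumption of this sketch).
    """
    factors = []
    for i, (mu, var, noise) in enumerate(zip(means, variances, noise_vars)):
        # Standard deviation of the change in arm i's posterior mean
        # caused by one additional sample.
        sigma_tilde = var / math.sqrt(var + noise)
        best_other = max(m for j, m in enumerate(means) if j != i)
        z = -abs(mu - best_other) / sigma_tilde
        # Expected one-step improvement in the maximum posterior mean.
        factors.append(sigma_tilde * (z * _cdf(z) + _pdf(z)))
    return factors


def kg_select(means, variances, noise_vars):
    """Index of the arm with the largest KG factor (the arm to sample next)."""
    f = kg_factors(means, variances, noise_vars)
    return max(range(len(f)), key=f.__getitem__)


def update(mean, var, noise, y):
    """Conjugate normal update of one arm's belief after observing reward y."""
    post_var = 1.0 / (1.0 / var + 1.0 / noise)
    post_mean = post_var * (mean / var + y / noise)
    return post_mean, post_var
```

In a sampling loop, one would call kg_select, draw a reward from the chosen arm, and apply update to that arm's belief. After the budget is spent, the arm with the largest posterior mean is reported as the best.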


