Risk Aversion In Learning Algorithms and an Application To Recommendation Systems. (arXiv:2205.04619v1 [cs.LG])
cs.LG updates on arXiv.org
Consider a bandit learning environment. We demonstrate that popular learning
algorithms such as Upper Confidence Bound (UCB) and $\varepsilon$-Greedy exhibit
risk aversion: when presented with two arms with the same expectation but
different variances, the algorithms tend to avoid the riskier (i.e.,
higher-variance) arm. We prove that $\varepsilon$-Greedy chooses the risky arm with
probability tending to $0$ when faced with a deterministic and a
Rademacher-distributed arm. We show experimentally that UCB also shows
risk-averse behavior, and that risk …
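
The $\varepsilon$-Greedy setting described above (one deterministic arm, one Rademacher arm, equal expectation) can be illustrated with a small simulation. This is only a minimal sketch under assumptions that the abstract does not state: the exploration schedule $\varepsilon_t = 1/t$, a deterministic reward of $0$, uniform tie-breaking between equal empirical means, and an initial estimate of $0$ for an arm that has not been pulled yet. The function names `run_epsilon_greedy` and `risky_pull_frequency` are hypothetical and do not come from the paper.

```python
import random


def run_epsilon_greedy(horizon, rng):
    """Simulate one run; return a list of booleans, True where the risky arm was pulled."""
    counts = [0, 0]      # pulls per arm: arm 0 = deterministic, arm 1 = Rademacher
    sums = [0.0, 0.0]    # cumulative reward per arm
    picked_risky = []
    for t in range(1, horizon + 1):
        eps = 1.0 / t    # ASSUMPTION: decaying exploration rate, not taken from the paper
        # ASSUMPTION: an arm with no pulls yet gets an empirical mean of 0.
        means = [sums[a] / counts[a] if counts[a] else 0.0 for a in (0, 1)]
        if rng.random() < eps or means[0] == means[1]:
            arm = rng.randrange(2)            # explore, or break a tie uniformly
        else:
            arm = 0 if means[0] > means[1] else 1
        # Deterministic arm always pays 0; Rademacher arm pays -1 or +1 with equal odds.
        reward = 0.0 if arm == 0 else rng.choice((-1.0, 1.0))
        counts[arm] += 1
        sums[arm] += reward
        picked_risky.append(arm == 1)
    return picked_risky


def risky_pull_frequency(runs=200, horizon=2000, seed=0):
    """Per-step fraction of independent runs in which the risky arm was pulled."""
    rng = random.Random(seed)
    totals = [0] * horizon
    for _ in range(runs):
        for t, risky in enumerate(run_epsilon_greedy(horizon, rng)):
            totals[t] += risky
    return [c / runs for c in totals]


if __name__ == "__main__":
    freq = risky_pull_frequency()
    print("mean risky-arm frequency, first 100 steps:", sum(freq[:100]) / 100)
    print("mean risky-arm frequency, last 100 steps: ", sum(freq[-100:]) / 100)
```

The intuition the sketch tries to make visible: once the Rademacher arm's empirical mean drops below the deterministic arm's, the greedy step stops pulling it, so its estimate is only corrected by increasingly rare exploration steps. Comparing the early and late frequencies gives a rough empirical picture of that effect; it is not a substitute for the paper's proof.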