Sept. 12, 2022, 1:11 a.m. | Yi Shen, Jessilyn Dunn, Michael M. Zavlanos

cs.LG updates on arXiv.org

In this paper, we consider a risk-averse multi-armed bandit (MAB) problem
where the goal is to learn a policy that minimizes the risk of low expected
return, as opposed to maximizing the expected return itself, which is the
objective in the usual approach to risk-neutral MAB. Specifically, we formulate
this problem as a transfer learning problem between an expert and a learner
agent in the presence of contexts that are only observable by the expert but
not by the learner. …
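
The contrast between the risk-neutral and risk-averse objectives described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract as excerpted here does not name a risk measure, so the sketch uses the empirical conditional value-at-risk (CVaR, the mean of the worst fraction of observed returns) as a stand-in, and the arm names, return values, and helper functions are hypothetical. It also leaves out the expert/learner transfer setting and the hidden contexts entirely.

```python
import math


def empirical_mean(samples):
    """Risk-neutral score: the average observed return."""
    return sum(samples) / len(samples)


def empirical_cvar(samples, alpha):
    """Risk-averse score (assumed here): mean of the worst alpha-fraction of returns."""
    k = max(1, math.ceil(alpha * len(samples)))
    worst = sorted(samples)[:k]  # lowest returns first
    return sum(worst) / k


def pick_arm(samples_by_arm, score):
    """Return the arm whose observed returns maximize the given score."""
    return max(samples_by_arm, key=lambda arm: score(samples_by_arm[arm]))


# Hypothetical observed returns for two arms.
returns = {
    "steady": [0.9, 1.0, 1.1, 1.0, 0.9, 1.1, 1.0, 1.0],
    "volatile": [3.0, 3.0, 3.0, 3.0, 3.0, 3.0, -4.0, -4.0],
}

risk_neutral_choice = pick_arm(returns, empirical_mean)
risk_averse_choice = pick_arm(returns, lambda s: empirical_cvar(s, alpha=0.25))

print("risk-neutral:", risk_neutral_choice)
print("risk-averse:", risk_averse_choice)
```

With these made-up numbers the volatile arm has the higher average return, so the mean-based rule prefers it, while the tail-based rule prefers the steady arm because the volatile arm's worst outcomes are much lower.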

Tags: arxiv, case study, emotion, health, mobile, multi-armed bandits, regulation, risk
