Sept. 23, 2022, 1:13 a.m. | Sorawit Saengkyongam, Nikolaj Thams, Jonas Peters, Niklas Pfister

stat.ML updates on arXiv.org

Contextual bandit and reinforcement learning algorithms have been
successfully used in various interactive learning systems such as online
advertising, recommender systems, and dynamic pricing. However, they have yet
to be widely adopted in high-stakes application domains, such as healthcare.
One reason may be that existing approaches assume that the underlying
mechanisms are static in the sense that they do not change over different
environments. In many real-world systems, however, the mechanisms are subject
to shifts across environments, which may invalidate …

