Dynamic Memory for Interpretable Sequential Optimisation. (arXiv:2206.13960v1 [cs.LG])
June 29, 2022, 1:11 a.m. | Srivas Chennu, Andrew Maher, Jamie Martin, Subash Prabanantham
stat.ML updates on arXiv.org
Real-world applications of reinforcement learning for recommendation and
experimentation face a practical challenge: the relative reward of different
bandit arms can evolve over the lifetime of the learning agent. To deal with
these non-stationary cases, the agent must forget some historical knowledge, as
it may no longer be relevant to minimise regret. We present a solution to
handling non-stationarity that is suitable for deployment at scale, to provide
business operators with automated adaptive optimisation. Our solution aims to
provide interpretable …
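
To make the idea of forgetting concrete, here is a minimal Python sketch of one common way to handle non-stationary bandit arms: Thompson sampling over Beta posteriors whose counts are exponentially discounted at every step. This is a generic illustration and not the authors' method, which the truncated abstract does not describe. The class name, the `discount` parameter, the two-arm setup, and the reward probabilities are all assumptions chosen for the example.

```python
import random


class DiscountedBernoulliArm:
    """Beta posterior over a success rate, with exponential forgetting (illustrative)."""

    def __init__(self, discount=0.95):
        self.discount = discount  # assumed forgetting factor in (0, 1]
        self.successes = 0.0
        self.failures = 0.0

    def decay(self):
        # Shrink old evidence so that recent rewards dominate the posterior.
        self.successes *= self.discount
        self.failures *= self.discount

    def update(self, reward):
        # reward is 0 or 1.
        self.successes += reward
        self.failures += 1 - reward

    def sample(self, rng):
        # Draw from Beta(1 + successes, 1 + failures), i.e. a uniform prior.
        return rng.betavariate(1.0 + self.successes, 1.0 + self.failures)


def run(horizon=4000, switch_at=2000, discount=0.95, seed=0):
    """Simulate two arms whose reward probabilities swap at `switch_at`."""
    rng = random.Random(seed)
    arms = [DiscountedBernoulliArm(discount) for _ in range(2)]
    pulls_after_switch = [0, 0]
    for t in range(horizon):
        # Hypothetical environment: arm 0 is best first, then arm 1.
        probs = (0.8, 0.2) if t < switch_at else (0.2, 0.8)
        choice = max(range(2), key=lambda i: arms[i].sample(rng))
        reward = 1 if rng.random() < probs[choice] else 0
        for arm in arms:
            arm.decay()  # every arm forgets a little at each step
        arms[choice].update(reward)
        if t >= switch_at:
            pulls_after_switch[choice] += 1
    return arms, pulls_after_switch


if __name__ == "__main__":
    arms, pulls = run()
    print("pulls after the switch:", pulls)
```

With a discount of 0.95, each arm's effective sample size is capped at 1 / (1 - 0.95) = 20 observations, so the posterior of an arm that stops paying off is overturned within a few dozen pulls. A discount of 1.0 recovers standard Thompson sampling, which never forgets.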