Restless Multi-Armed Bandits under Exogenous Global Markov Process. (arXiv:2202.13665v2 [cs.LG] UPDATED)
Oct. 11, 2022, 1:13 a.m. | Tomer Gafni, Michal Yemini, Kobi Cohen
cs.LG updates on arXiv.org arxiv.org
We consider an extension of the restless multi-armed bandit (RMAB) problem
with unknown arm dynamics, in which an unknown exogenous global Markov process
governs the reward distribution of each arm. Under each global state, the
reward process of each arm evolves according to an unknown Markovian rule
that differs across arms. At each time step, a player chooses one of N arms
to play and receives a random reward from a finite set of
reward states. The arms …
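
The setting in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' algorithm: the names `simulate`, `global_P`, `arm_P`, `rewards`, and `uniform_policy` are hypothetical, the transition matrices are made-up numbers, the reward is taken to be a value attached to the played arm's current state, and a uniformly random policy stands in for whatever learning rule the paper proposes.

```python
import random


def sample(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def simulate(global_P, arm_P, rewards, horizon, policy, seed=0):
    """Simulate a restless bandit driven by an exogenous global Markov chain.

    global_P[g][g2]     : transition probabilities of the global chain
    arm_P[i][g][s][s2]  : transition probabilities of arm i under global state g
    rewards[s]          : reward value attached to arm state s (assumption)
    policy(t, n_arms, rng) -> index of the arm to play at time t
    """
    rng = random.Random(seed)
    n_arms = len(arm_P)
    g = 0                    # initial global state (assumption)
    states = [0] * n_arms    # initial arm states (assumption)
    total = 0.0
    history = []
    for t in range(horizon):
        i = policy(t, n_arms, rng)
        r = rewards[states[i]]
        total += r
        history.append((g, i, r))
        # Restless: every arm moves, played or not, under the current global state.
        states = [sample(arm_P[k][g][states[k]], rng) for k in range(n_arms)]
        # Exogenous: the global chain moves independently of the player's action.
        g = sample(global_P[g], rng)
    return total, history


def uniform_policy(t, n_arms, rng):
    """Placeholder policy: pick an arm uniformly at random."""
    return rng.randrange(n_arms)


# Illustrative parameters: 2 global states, 2 arms, 2 reward states.
global_P = [[0.9, 0.1], [0.2, 0.8]]
arm_P = [
    [[[0.7, 0.3], [0.4, 0.6]], [[0.2, 0.8], [0.1, 0.9]]],  # arm 0
    [[[0.5, 0.5], [0.5, 0.5]], [[0.9, 0.1], [0.6, 0.4]]],  # arm 1
]
rewards = [0.0, 1.0]

total, history = simulate(global_P, arm_P, rewards, 50, uniform_policy, seed=1)
```

Each entry of `history` is a `(global state, arm played, reward)` triple, which makes it easy to compare the placeholder policy against any other rule by swapping the `policy` argument.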