Learning in Markov Decision Processes under Constraints. (arXiv:2002.12435v5 [cs.LG] UPDATED)
Jan. 7, 2022, 2:10 a.m. | Rahul Singh, Abhishek Gupta, Ness B. Shroff
cs.LG updates on arXiv.org
We consider reinforcement learning (RL) in Markov Decision Processes in which
an agent repeatedly interacts with an environment that is modeled by a
controlled Markov process. At each time step $t$, the agent earns a reward and
also incurs a cost vector consisting of $M$ costs. We design model-based RL
algorithms that maximize the cumulative reward earned over a time horizon of
$T$ time steps, while simultaneously ensuring that the average values of the
$M$ cost expenditures are bounded by agent-specified thresholds
$c^{ub}_i$, $i=1,2,\ldots,M$. …
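
As a reading aid, the objective described above can be sketched as a constrained optimization problem. The notation is assumed here and is not taken from the paper: $\pi$ denotes the agent's policy, $r_t$ the reward earned at time step $t$, and $c_{i,t}$ the $i$-th component of the cost vector incurred at time step $t$. The truncated abstract does not say whether the constraints are meant to hold in expectation, with high probability, or in a regret sense; the expectation form below is only one plausible reading.

```latex
\begin{aligned}
\max_{\pi}\quad & \mathbb{E}_{\pi}\!\left[\sum_{t=1}^{T} r_t\right] \\
\text{subject to}\quad & \frac{1}{T}\,\mathbb{E}_{\pi}\!\left[\sum_{t=1}^{T} c_{i,t}\right] \le c^{ub}_i,
\qquad i = 1,2,\ldots,M.
\end{aligned}
```

Under this reading, each of the $M$ constraints caps a time-averaged cost, while the reward enters only through the objective.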