On the Convergence of Modified Policy Iteration in Risk Sensitive Exponential Cost Markov Decision Processes
Feb. 16, 2024, 5:43 a.m. | Yashaswini Murthy, Mehrdad Moharrami, R. Srikant
cs.LG updates on arXiv.org arxiv.org
Abstract: Modified policy iteration (MPI) is a dynamic programming algorithm that combines elements of policy iteration and value iteration. The convergence of MPI has been well studied in the context of discounted and average-cost MDPs. In this work, we consider the exponential-cost risk-sensitive MDP formulation, which is known to provide some robustness to model parameters. Although policy iteration and value iteration have been well studied in the context of risk-sensitive MDPs, MPI is unexplored. …
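For readers unfamiliar with MPI, here is a minimal sketch of the classical discounted-cost version that the abstract mentions as already well studied. It is not the exponential-cost risk-sensitive variant analyzed in the paper, whose details the truncated abstract does not give. The names `P`, `c`, `gamma`, and `m` and the array layout are assumptions chosen for illustration. Each iteration takes one greedy improvement step (as in value iteration) followed by `m` partial evaluation sweeps of the chosen policy (as in policy iteration).

```python
import numpy as np

def modified_policy_iteration(P, c, gamma=0.9, m=5, tol=1e-8, max_iter=10_000):
    """Classical MPI for a discounted-cost MDP (illustrative sketch only).

    P: array of shape (A, S, S); P[a, s, t] is the probability of s -> t under action a.
    c: array of shape (S, A); c[s, a] is the one-step cost (assumed nonnegative here).
    m: number of partial policy-evaluation sweeps per iteration.
    """
    n_states = c.shape[0]
    states = np.arange(n_states)
    V = np.zeros(n_states)
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iter):
        # Policy improvement: one-step lookahead, then pick the cost-minimizing action.
        Q = c + gamma * np.einsum("ast,t->sa", P, V)
        policy = Q.argmin(axis=1)
        V_new = Q[states, policy]
        # Partial policy evaluation: apply the fixed-policy operator m times.
        P_pi = P[policy, states, :]  # row s is the transition row of the greedy action in s
        c_pi = c[states, policy]
        for _ in range(m):
            V_new = c_pi + gamma * (P_pi @ V_new)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, policy
        V = V_new
    return V, policy
```

In this sketch, `m = 0` reduces to value iteration, and letting `m` grow large approaches policy iteration, since the evaluation step then nearly solves for the value of the current policy.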