Geometric Policy Iteration for Markov Decision Processes. (arXiv:2206.05809v2 [cs.LG] UPDATED)
June 27, 2022, 1:11 a.m. | Yue Wu, Jesús A. De Loera
cs.LG updates on arXiv.org arxiv.org
Recently discovered polyhedral structures of the value function for finite
state-action discounted Markov decision processes (MDPs) shed light on the
success of reinforcement learning. We investigate the value
function polytope in greater detail and characterize the polytope boundary
using a hyperplane arrangement. We further show that the value space is a union
of finitely many cells of the same hyperplane arrangement and relate it to the
polytope of the classical linear programming formulation for MDPs. Inspired by
these geometric …
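To ground the objects the abstract discusses, here is a minimal sketch of *classical* policy iteration on a toy finite discounted MDP. This is not the paper's geometric algorithm; the transition matrices, rewards, and discount factor below are invented for illustration, and the exact-evaluation step (solving a linear system) is the textbook formulation.

```python
# Sketch only: classical policy iteration on a made-up 2-state,
# 2-action discounted MDP. All numbers are illustrative assumptions.
import numpy as np

# P[a][s, s'] = transition probability, R[a][s] = expected reward.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),   # action 0
     np.array([[0.1, 0.9], [0.7, 0.3]])]   # action 1
R = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
gamma = 0.9
n_states, n_actions = 2, 2

def evaluate(policy):
    """Exact policy evaluation: solve (I - gamma * P_pi) v = r_pi."""
    P_pi = np.array([P[policy[s]][s] for s in range(n_states)])
    r_pi = np.array([R[policy[s]][s] for s in range(n_states)])
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)

def policy_iteration():
    policy = [0] * n_states
    while True:
        v = evaluate(policy)
        # Greedy improvement: maximize the one-step Bellman lookahead.
        new_policy = [max(range(n_actions),
                          key=lambda a: R[a][s] + gamma * P[a][s] @ v)
                      for s in range(n_states)]
        if new_policy == policy:   # fixed point reached: optimal policy
            return policy, v
        policy = new_policy

policy, v = policy_iteration()
print(policy, v)
```

Each policy corresponds to one value function, i.e. one point of the value polytope the paper studies; policy iteration hops between such points until the Bellman optimality condition holds.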