Policy Gradient Methods in the Presence of Symmetries and State Abstractions
March 8, 2024, 5:42 a.m. | Prakash Panangaden, Sahand Rezaei-Shoshtari, Rosie Zhao, David Meger, Doina Precup
cs.LG updates on arXiv.org
Abstract: Reinforcement learning (RL) on high-dimensional and complex problems relies on abstraction for improved efficiency and generalization. In this paper, we study abstraction in the continuous-control setting, and extend the definition of Markov decision process (MDP) homomorphisms to the setting of continuous state and action spaces. We derive a policy gradient theorem on the abstract MDP for both stochastic and deterministic policies. Our policy gradient results allow for leveraging approximate symmetries of the environment for policy …
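The abstract describes computing policy gradients on an abstract MDP obtained through a (continuous) MDP homomorphism. As a rough illustration only, not the authors' method or code, the sketch below assumes a hypothetical learned state-abstraction map phi and a Gaussian policy pi(a | phi(s)) over a continuous action space, and applies a standard score-function (REINFORCE) gradient through both; the class names, dimensions, and placeholder batch are invented for this example.

```python
import torch
import torch.nn as nn

# Hypothetical sketch (not the paper's implementation): a stochastic
# policy defined on an abstract state z = phi(s), with the policy
# gradient taken through the abstraction map and the abstract policy.

class StateAbstraction(nn.Module):
    """Maps a raw state s to an abstract state z = phi(s)."""
    def __init__(self, state_dim, abstract_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, abstract_dim), nn.Tanh())

    def forward(self, s):
        return self.net(s)

class AbstractGaussianPolicy(nn.Module):
    """Gaussian policy pi(a | z) over a continuous action space."""
    def __init__(self, abstract_dim, action_dim):
        super().__init__()
        self.mean = nn.Linear(abstract_dim, action_dim)
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def forward(self, z):
        return torch.distributions.Normal(self.mean(z), self.log_std.exp())

state_dim, abstract_dim, action_dim = 8, 3, 2
phi = StateAbstraction(state_dim, abstract_dim)
pi = AbstractGaussianPolicy(abstract_dim, action_dim)
opt = torch.optim.Adam(list(phi.parameters()) + list(pi.parameters()), lr=3e-4)

# One REINFORCE-style update on a fake batch of transitions.
s = torch.randn(32, state_dim)        # raw states
a = torch.randn(32, action_dim)       # actions taken
ret = torch.randn(32)                 # sampled returns (placeholder data)

dist = pi(phi(s))
log_prob = dist.log_prob(a).sum(-1)   # log pi(a | phi(s))
loss = -(log_prob * ret).mean()       # score-function gradient estimator
opt.zero_grad()
loss.backward()
opt.step()
```

The key structural point the sketch tries to convey is that the policy never sees the raw state: it acts on phi(s), so any symmetry collapsed by phi is automatically shared by the policy and its gradient.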