Policy Gradient Methods in the Presence of Symmetries and State Abstractions

Jan. 1, 2024, midnight | Prakash Panangaden, Sahand Rezaei-Shoshtari, Rosie Zhao, David Meger, Doina Precup

JMLR www.jmlr.org

Reinforcement learning (RL) on high-dimensional and complex problems relies on abstraction for improved efficiency and generalization. In this paper, we study abstraction in the continuous-control setting, and extend the definition of Markov decision process (MDP) homomorphisms to the setting of continuous state and action spaces. We derive a policy gradient theorem on the abstract MDP for both stochastic and deterministic policies. Our policy gradient results allow for leveraging approximate symmetries of the environment for policy optimization. Based on these theorems, …
