Enhancing Policy Gradient with the Polyak Step-Size Adaption | allainews.com

April 12, 2024, 4:41 a.m. | Yunxiang Li, Rui Yuan, Chen Fan, Mark Schmidt, Samuel Horv\'ath, Robert M. Gower, Martin Tak\'a\v{c}

cs.LG updates on arXiv.org arxiv.org

arXiv:2404.07525v1 Announce Type: new
Abstract: Policy gradient is a widely utilized and foundational algorithm in the field of reinforcement learning (RL). Renowned for its convergence guarantees and stability compared to other RL algorithms, its practical application is often hindered by sensitivity to hyper-parameters, particularly the step-size. In this paper, we introduce the integration of the Polyak step-size in RL, which automatically adjusts the step-size without prior knowledge. To adapt this method to RL settings, we address several issues, including unknown …

abstract algorithm algorithms application arxiv convergence cs.lg foundational gradient paper parameters policy practical reinforcement reinforcement learning sensitivity stability type

More from arxiv.org / cs.LG updates on arXiv.org

Course Recommender Systems Need to Consider the Job Market 13 hours ago | arxiv.org

abstract arxiv course cs.ir +16

$\texttt{immrax}$: A Parallelizable and Differentiable Toolbox for Interval Analysis and Mixed Monotone Reachability in JAX 13 hours ago | arxiv.org

abstract analysis arxiv compilation +18

Thousands of AI Authors on the Future of AI 13 hours ago | arxiv.org

abstract advanced advanced ai ai progress +21

Graphene: Infrastructure Security Posture Analysis with AI-generated Attack Graphs 13 hours ago | arxiv.org

abstract analysis arxiv assessment +24

Volume-Preserving Transformers for Learning Time Series Data with Structure 13 hours ago | arxiv.org

abstract arxiv cs.lg cs.na +24

Eureka: Human-Level Reward Design via Coding Large Language Models 13 hours ago | arxiv.org

abstract algorithm arxiv bridge +25

Reconstruction of Unstable Heavy Particles Using Deep Symmetry-Preserving Attention Networks 13 hours ago | arxiv.org

abstract arxiv attention cs.lg +11

FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search 13 hours ago | arxiv.org

abstract arxiv become compression +24

Gaussian random field approximation via Stein's method with applications to wide random neural networks 13 hours ago | arxiv.org

abstract applications approximation arxiv +14

Data Architect

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

View on ai-jobs.net

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

View on ai-jobs.net

#13721 - Data Engineer - AI Model Testing

@ Qualitest | Miami, Florida, United States

View on ai-jobs.net

Elasticsearch Administrator

@ ManTech | 201BF - Customer Site, Chantilly, VA

View on ai-jobs.net