June 18, 2024, 4:50 a.m. | Jakob Hollenstein, Georg Martius, Justus Piater

cs.LG updates on arXiv.org

arXiv:2312.11091v2 Announce Type: replace
Abstract: Proximal Policy Optimization (PPO), a popular on-policy deep reinforcement learning method, employs a stochastic policy for exploration. In this paper, we propose a colored noise-based stochastic policy variant of PPO. Previous research highlighted the importance of temporal correlation in action noise for effective exploration in off-policy reinforcement learning. Building on this, we investigate whether correlated noise can also enhance exploration in on-policy methods like PPO. We discovered that correlated noise for action selection improves learning …
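To make the idea of temporally correlated action noise concrete, below is a minimal, hypothetical sketch (not the authors' code) of how colored noise could replace the usual i.i.d. Gaussian exploration noise when sampling actions. It uses the standard spectral method: draw random phases, shape their amplitudes with a 1/f^beta power spectrum (beta = 0 gives white noise, beta = 1 gives pink noise), and inverse-FFT back to the time domain. All names, shapes, and the policy stand-ins are illustrative assumptions.

import numpy as np

def colored_noise(beta: float, steps: int, action_dim: int, rng=None) -> np.ndarray:
    """Return a (steps, action_dim) array of zero-mean, unit-variance noise whose
    power spectrum falls off as 1/f^beta (beta=0 -> white, beta=1 -> pink)."""
    rng = np.random.default_rng() if rng is None else rng
    freqs = np.fft.rfftfreq(steps)
    freqs[0] = freqs[1]                       # avoid division by zero at f = 0
    amplitude = freqs ** (-beta / 2.0)        # power ~ 1/f^beta  =>  amplitude ~ f^(-beta/2)
    phases = rng.standard_normal((len(freqs), action_dim)) \
             + 1j * rng.standard_normal((len(freqs), action_dim))
    spectrum = amplitude[:, None] * phases
    noise = np.fft.irfft(spectrum, n=steps, axis=0)
    # normalize each action dimension to unit variance so the policy's std keeps its meaning
    noise /= noise.std(axis=0, keepdims=True)
    return noise

# Illustrative usage: perturb the policy mean with correlated (pink) instead of white noise.
episode_len, action_dim = 256, 2
eps = colored_noise(beta=1.0, steps=episode_len, action_dim=action_dim)
for t in range(episode_len):
    mean = np.zeros(action_dim)               # stand-in for the policy network's mean output
    std = 0.3 * np.ones(action_dim)           # stand-in for the learned action std
    action = mean + std * eps[t]               # exploration noise is correlated across time steps

Because eps[t] and eps[t+1] are correlated, consecutive actions drift smoothly rather than jittering independently, which is the exploration behavior the abstract refers to.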

