Momentum-Based Policy Gradient with Second-Order Information. (arXiv:2205.08253v2 [cs.LG] UPDATED)
Aug. 19, 2022, 1:11 a.m. | Saber Salehkaleybar, Sadegh Khorasani, Negar Kiyavash, Niao He, Patrick Thiran
cs.LG updates on arXiv.org | arxiv.org
Variance-reduced gradient estimators for policy gradient methods have been
one of the main focuses of research in reinforcement learning in recent years,
as they allow acceleration of the estimation process. We propose a
variance-reduced policy-gradient method, called SHARP, which incorporates
second-order information into stochastic gradient descent (SGD) using momentum
with a time-varying learning rate. The SHARP algorithm is parameter-free, achieving
an $\epsilon$-approximate first-order stationary point with $O(\epsilon^{-3})$
trajectories, while using a batch size of $O(1)$ at each iteration.
Unlike …
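
The abstract does not state the update rule, so the following is only a minimal sketch of the general idea it describes: a momentum gradient estimator corrected with second-order (Hessian-vector product) information, run with one sample per iteration and time-varying step sizes. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy quadratic objective (minimized, standing in for the RL return), the noise model, the names `noisy_grad`, `noisy_hvp`, and `hessian_aided_momentum`, the normalized step, and the $t^{-2/3}$ schedules.

```python
import numpy as np

# Toy stand-in for the RL objective (assumption for illustration only):
# minimize f(x) = 0.5 * x^T A x - b^T x, whose minimizer solves A x = b.
A = np.array([[1.0, 0.0], [0.0, 3.0]])
b = np.array([1.0, -2.0])
X_STAR = np.linalg.solve(A, b)


def noisy_grad(x, rng, sigma=0.1):
    """Single-sample stochastic gradient: true gradient plus Gaussian noise."""
    return A @ x - b + sigma * rng.standard_normal(x.shape)


def noisy_hvp(x, u, rng, sigma=0.1):
    """Single-sample stochastic Hessian-vector product.

    The Hessian of this toy quadratic is constant, so x is unused here;
    it is kept in the signature because in general the Hessian depends on x.
    """
    noise = sigma * rng.standard_normal(A.shape)
    return (A + 0.5 * (noise + noise.T)) @ u


def hessian_aided_momentum(x0, steps=2000, eta0=0.5, alpha0=1.0, seed=0):
    """Momentum estimator with a Hessian-vector correction (illustrative)."""
    rng = np.random.default_rng(seed)
    x_prev = np.asarray(x0, dtype=float)
    v = noisy_grad(x_prev, rng)  # initial gradient estimate
    x = x_prev - eta0 * v / (np.linalg.norm(v) + 1e-12)
    for t in range(2, steps + 1):
        alpha = alpha0 / t ** (2.0 / 3.0)  # momentum weight (assumed schedule)
        eta = eta0 / t ** (2.0 / 3.0)      # step size (assumed schedule)
        # Evaluate the Hessian at a random point between the last two iterates.
        q = rng.uniform()
        x_mid = q * x + (1.0 - q) * x_prev
        # Transport the old estimate to the new iterate using curvature.
        correction = noisy_hvp(x_mid, x - x_prev, rng)
        # Blend the transported estimate with one fresh stochastic gradient.
        v = (1.0 - alpha) * (v + correction) + alpha * noisy_grad(x, rng)
        # Normalized descent step.
        x_prev, x = x, x - eta * v / (np.linalg.norm(v) + 1e-12)
    return x
```

Calling `hessian_aided_momentum(np.zeros(2))` should return a point near `X_STAR` on this toy problem. The correction term is what separates this from plain momentum: the previous estimate is moved to the new iterate before averaging, so the running estimate does not lag behind the current gradient.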