April 17, 2024, 4:43 a.m. | Runzhe Wang, Sadhika Malladi, Tianhao Wang, Kaifeng Lyu, Zhiyuan Li

cs.LG updates on arXiv.org

arXiv:2307.15196v2 Announce Type: replace
Abstract: Momentum is known to accelerate the convergence of gradient descent in strongly convex settings without stochastic gradient noise. In stochastic optimization, such as when training neural networks, folklore suggests that momentum may help deep learning optimization by reducing the variance of the stochastic gradient update, but previous theoretical analyses have not found momentum to offer any provable acceleration. Theoretical results in this paper clarify the role of momentum in stochastic settings where the learning rate is …
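For reference, a common form of the stochastic heavy-ball momentum update reads as follows; the notation here is a generic illustration and not necessarily the exact parameterization analyzed in the paper:

v_{t+1} = \beta v_t + \nabla f(x_t; \xi_t), \qquad x_{t+1} = x_t - \eta v_{t+1},

where \eta is the learning rate, \beta \in [0, 1) is the momentum coefficient, and \xi_t denotes the minibatch sampled at step t. Setting \beta = 0 recovers plain SGD, which is the baseline against which any variance-reduction or acceleration benefit of momentum is measured.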

