Web: http://arxiv.org/abs/2206.11124

June 23, 2022, 1:11 a.m. | Maksim Velikanov, Denis Kuznedelev, Dmitry Yarotsky

cs.LG updates on arXiv.org arxiv.org

Mini-batch SGD with momentum is a fundamental algorithm for learning large
predictive models. In this paper we develop a new analytic framework to analyze
mini-batch SGD for linear models across different momentum values and batch
sizes. Our key idea is to describe the sequence of loss values in terms of its
generating function, which can be written in compact form under a diagonal
approximation for the second moments of the model weights. By analyzing this
generating function, we deduce various conclusions …
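For context, the algorithm the abstract analyzes can be sketched in a few lines. The snippet below is an illustrative implementation of mini-batch SGD with heavy-ball momentum on a linear least-squares model; the data, hyperparameters, and variable names are assumptions for the example, not values from the paper, and the paper's generating-function analysis itself is not reproduced here.

```python
import numpy as np

# Illustrative setup: synthetic linear regression data (not from the paper).
rng = np.random.default_rng(0)
n, d = 256, 8
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def loss(w):
    # Mean-squared error over the full dataset.
    return 0.5 * np.mean((X @ w - y) ** 2)

w = np.zeros(d)           # model weights
v = np.zeros(d)           # momentum buffer
lr, beta, batch = 0.05, 0.9, 32   # assumed hyperparameters

losses = []
for step in range(500):
    # Sample a mini-batch and compute its stochastic gradient.
    idx = rng.choice(n, size=batch, replace=False)
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch
    # Heavy-ball momentum update.
    v = beta * v - lr * grad
    w = w + v
    losses.append(loss(w))
```

The sequence `losses` recorded here is the loss-value sequence whose generating function the paper studies analytically.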
