A view of mini-batch SGD via generating functions: conditions of convergence, phase transitions, benefit from negative momenta. (arXiv:2206.11124v1 [cs.LG])
Web: http://arxiv.org/abs/2206.11124
June 23, 2022, 1:11 a.m. | Maksim Velikanov, Denis Kuznedelev, Dmitry Yarotsky
cs.LG updates on arXiv.org
Mini-batch SGD with momentum is a fundamental algorithm for learning large
predictive models. In this paper we develop a new analytic framework to analyze
mini-batch SGD for linear models across different momenta and batch sizes. Our
key idea is to describe the sequence of loss values in terms of its generating
function, which can be written in a compact form assuming a diagonal
approximation for the second moments of the model weights. By analyzing this
generating function, we deduce various conclusions …
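For context, the algorithm the paper analyzes can be sketched as follows. This is a minimal illustration of mini-batch SGD with heavy-ball momentum on a linear least-squares model, where a negative momentum coefficient (`beta < 0`) is also admissible; it does not implement the paper's generating-function framework, and all names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

# Synthetic noiseless linear regression problem (illustrative only).
rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true

def sgd_momentum(lr=0.05, beta=0.0, batch=20, steps=500):
    """Mini-batch SGD with heavy-ball momentum on the least-squares loss.

    beta is the momentum coefficient; beta < 0 gives negative momentum,
    one of the regimes studied in the paper.
    """
    w = np.zeros(d)
    v = np.zeros(d)
    for _ in range(steps):
        idx = rng.choice(n, size=batch, replace=False)
        grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch
        v = beta * v - lr * grad  # heavy-ball velocity update
        w = w + v
    return np.mean((X @ w - y) ** 2)  # final training loss

loss_plain = sgd_momentum(beta=0.0)
loss_negative = sgd_momentum(beta=-0.2)  # negative momentum run
```

Both runs converge on this interpolable problem; the paper's contribution is characterizing when (and how fast) such convergence occurs as a function of momentum and batch size.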