On the Generalization of Stochastic Gradient Descent with Momentum

Jan. 1, 2024, midnight | Ali Ramezani-Kebrya, Kimon Antonakopoulos, Volkan Cevher, Ashish Khisti, Ben Liang

JMLR www.jmlr.org

While momentum-based accelerated variants of stochastic gradient descent (SGD) are widely used to train machine learning models, there is little theoretical understanding of the generalization error of such methods. In this work, we first show that there exists a convex loss function for which the stability gap for multiple epochs of SGD with standard heavy-ball momentum (SGDM) becomes unbounded. Then, for smooth Lipschitz loss functions, we analyze a modified momentum-based update rule, i.e., SGD with early momentum (SGDEM) under a …
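The two update rules named in the abstract can be illustrated with a minimal sketch. The heavy-ball step below is the standard form, w_next = w + mu * (w - w_prev) - step_size * grad. The abstract is cut off before it defines SGDEM, so the schedule here is an assumption: momentum is applied for the first `momentum_steps` updates and set to zero afterwards. The function name, parameter names, default values, and the toy least-squares loss are all hypothetical and are not taken from the paper.

```python
import random


def sgd_early_momentum(grad, w0, data, step_size=0.1, momentum=0.9,
                       momentum_steps=None, num_steps=100, seed=0):
    """Heavy-ball SGD on a scalar parameter.

    momentum_steps=None keeps momentum on for the whole run (SGDM).
    momentum_steps=k uses momentum for the first k updates only
    (an assumed reading of "early momentum", not the paper's definition).
    """
    rng = random.Random(seed)
    w_prev, w = w0, w0
    for t in range(num_steps):
        z = rng.choice(data)  # draw one training example uniformly at random
        use_momentum = momentum_steps is None or t < momentum_steps
        mu = momentum if use_momentum else 0.0
        # Heavy-ball update: momentum term plus a stochastic gradient step.
        w_next = w + mu * (w - w_prev) - step_size * grad(w, z)
        w_prev, w = w, w_next
    return w


def grad(w, z):
    """Gradient of the toy loss 0.5 * (w - z)**2 with respect to w."""
    return w - z


data = [1.0, 2.0, 3.0]
w_sgdm = sgd_early_momentum(grad, 0.0, data, num_steps=200)
w_sgdem = sgd_early_momentum(grad, 0.0, data, momentum_steps=20, num_steps=200)
```

Setting `momentum_steps=0` reduces the loop to plain SGD, which makes it easy to compare the three regimes on the same data and seed.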


