Jan. 17, 2022, 2:10 a.m. | Li Wang (1), Yingcong Zhou (2), Zhiguo Fu (1) ((1) Northeast Normal University, (2) Beihua University)

cs.LG updates on arXiv.org

The study of the implicit regularization induced by gradient-based
optimization is a longstanding pursuit. In the present paper, we characterize
the implicit regularization of momentum gradient descent (MGD) with early
stopping by comparing it with explicit $\ell_2$-regularization (ridge). In
detail, we study MGD in the continuous-time view, the so-called momentum
gradient flow (MGF), and show that its behavior is closer to ridge than that of
gradient descent (GD) [Ali et al., 2019] for least squares regression.
Moreover, we prove that, under the …
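The comparison the abstract describes can be sketched numerically. The code below is not from the paper: it is a minimal illustration on a synthetic least-squares problem, using discrete heavy-ball momentum as a stand-in for the momentum gradient flow, where "early stopping" means picking an intermediate iterate along the optimization path and comparing it to a ridge solution. All problem sizes, step sizes, and the ridge penalty are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem (sizes are illustrative, not from the paper).
n, d = 200, 20
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.5 * rng.standard_normal(n)

def ridge(X, y, lam):
    """Explicit l2-regularized (ridge) solution: (X^T X + lam I)^{-1} X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def momentum_gd(X, y, lr=1e-3, beta=0.9, steps=500):
    """Heavy-ball momentum GD on 0.5 * ||y - X w||^2.

    Returns the full iterate path; early stopping corresponds to
    selecting path[t] for some t < steps.
    """
    w = np.zeros(X.shape[1])
    v = np.zeros_like(w)
    path = [w.copy()]
    for _ in range(steps):
        grad = X.T @ (X @ w - y)
        v = beta * v - lr * grad  # momentum update
        w = w + v
        path.append(w.copy())
    return path

path = momentum_gd(X, y)
w_early = path[50]               # an early-stopped MGD iterate
w_ridge = ridge(X, y, lam=10.0)  # an (arbitrarily tuned) ridge solution
print("||MGD(early) - ridge|| =", np.linalg.norm(w_early - w_ridge))
```

The paper's claim pairs each stopping time with a corresponding ridge penalty; here the pairing is eyeballed, so the printed distance is only a qualitative check that the early-stopped iterate sits near a shrunken (ridge-like) estimate rather than the unregularized least-squares solution.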

