Understanding Gradient Descent on Edge of Stability in Deep Learning. (arXiv:2205.09745v1 [cs.LG])
May 20, 2022, 1:12 a.m. | Sanjeev Arora, Zhiyuan Li, Abhishek Panigrahi
cs.LG updates on arXiv.org
Deep learning experiments in Cohen et al. (2021) using deterministic Gradient Descent (GD) revealed an "Edge of Stability" (EoS) phase, in which the learning rate (LR) and the sharpness (i.e., the largest eigenvalue of the Hessian) no longer behave as in traditional optimization. Sharpness stabilizes around 2/LR, and the loss goes up and down across iterations, yet still with an overall downward trend. The current paper mathematically analyzes a new mechanism of implicit regularization in the EoS phase, whereby GD updates due to non-smooth …
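The 2/LR threshold can be illustrated on a toy problem. The sketch below (an assumption for intuition only, not from the paper, which analyzes deep networks) runs gradient descent on a one-dimensional quadratic f(x) = ½·sharpness·x², where the sharpness is just the scalar second derivative. Classical analysis predicts the loss shrinks when sharpness < 2/LR and blows up when sharpness > 2/LR, which is why EoS training hovers near that boundary:

```python
def gd_losses(sharpness, lr, x0=1.0, steps=50):
    """Run GD on f(x) = 0.5 * sharpness * x^2 and record the loss per step.

    The update x <- x - lr * f'(x) = (1 - lr * sharpness) * x contracts
    when |1 - lr * sharpness| < 1, i.e. exactly when sharpness < 2 / lr.
    """
    x, losses = x0, []
    for _ in range(steps):
        x -= lr * sharpness * x          # gradient step: f'(x) = sharpness * x
        losses.append(0.5 * sharpness * x * x)
    return losses

lr = 0.1
stable = gd_losses(sharpness=1.9 / lr, lr=lr)    # just below 2/LR: loss decays
unstable = gd_losses(sharpness=2.1 / lr, lr=lr)  # just above 2/LR: loss grows
```

In the stable run each step multiplies x by (1 − 0.1·19) = −0.9, so the iterates oscillate in sign while the loss decreases; in the unstable run the factor is −1.1 and the loss diverges. The EoS phenomenon in the paper is the nontrivial regime where deep-network training sits at this boundary yet the loss still trends downward.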