On the Maximum Hessian Eigenvalue and Generalization. (arXiv:2206.10654v1 [cs.LG])
June 23, 2022, 1:12 a.m. | Simran Kaur, Jeremy Cohen, Zachary C. Lipton
stat.ML updates on arXiv.org arxiv.org
The mechanisms by which certain training interventions, such as increasing
learning rates and applying batch normalization, improve the generalization of
deep networks remain a mystery. Prior works have speculated that "flatter"
solutions generalize better than "sharper" solutions to unseen data, motivating
several metrics for measuring flatness (particularly $\lambda_{max}$, the
largest eigenvalue of the Hessian of the loss) and algorithms, such as
Sharpness-Aware Minimization (SAM) [1], that directly optimize for flatness.
Other works question the link between $\lambda_{max}$ and generalization. In …
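
To make the flatness metric concrete, here is a minimal sketch of one common way to estimate $\lambda_{max}$: power iteration on Hessian-vector products. It is an illustration only, not the procedure used in the paper. The helper names (`hvp`, `lambda_max`, `toy_grad`) and the toy quadratic loss are assumptions introduced here, and the Hessian-vector product is approximated by central differences of the gradient rather than by automatic differentiation. Power iteration converges to the eigenvalue of largest magnitude, which coincides with $\lambda_{max}$ only when that eigenvalue is positive.

```python
import math
import random


def hvp(grad_fn, w, v, eps=1e-4):
    """Approximate the Hessian-vector product H(w) v.

    Uses a central difference of the gradient:
    (grad(w + eps*v) - grad(w - eps*v)) / (2*eps).
    """
    g_plus = grad_fn([wi + eps * vi for wi, vi in zip(w, v)])
    g_minus = grad_fn([wi - eps * vi for wi, vi in zip(w, v)])
    return [(a - b) / (2 * eps) for a, b in zip(g_plus, g_minus)]


def lambda_max(grad_fn, w, iters=100, seed=0):
    """Estimate the dominant Hessian eigenvalue at w by power iteration."""
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in w]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    lam = 0.0
    for _ in range(iters):
        hv = hvp(grad_fn, w, v)
        # Rayleigh quotient v^T H v for the current unit vector v.
        lam = sum(a * b for a, b in zip(v, hv))
        norm = math.sqrt(sum(x * x for x in hv))
        if norm == 0.0:
            break
        v = [x / norm for x in hv]
    return lam


def toy_grad(w):
    """Gradient of the hypothetical loss L(w) = 0.5 * w^T A w, A = [[3, 1], [1, 2]]."""
    return [3 * w[0] + w[1], w[0] + 2 * w[1]]


if __name__ == "__main__":
    print(lambda_max(toy_grad, [0.5, -0.3]))
```

For the toy loss the Hessian is the constant matrix A, whose largest eigenvalue is $(5 + \sqrt{5})/2$, so the estimate can be checked against a known value. For a real network, the finite-difference `hvp` would typically be replaced by an autodiff-based Hessian-vector product.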