June 23, 2022, 1:12 a.m. | Simran Kaur, Jeremy Cohen, Zachary C. Lipton

stat.ML updates on arXiv.org

The mechanisms by which certain training interventions, such as increasing
learning rates and applying batch normalization, improve the generalization of
deep networks remain a mystery. Prior works have speculated that "flatter"
solutions generalize better to unseen data than "sharper" solutions, motivating
several metrics for measuring flatness (particularly $\lambda_{max}$, the
largest eigenvalue of the Hessian of the loss) and algorithms, such as
Sharpness-Aware Minimization (SAM) [1], that directly optimize for flatness.
Other works question the link between $\lambda_{max}$ and generalization. In …
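For context on the sharpness metric discussed above: $\lambda_{max}$ is typically estimated without forming the full Hessian, e.g. by power iteration on Hessian-vector products. The sketch below is a minimal, illustrative implementation in JAX (not the paper's code); the function names `hvp` and `lambda_max` and the toy quadratic loss are assumptions introduced for this example.

```python
import jax
import jax.numpy as jnp


def hvp(loss_fn, params, v):
    # Hessian-vector product H @ v via forward-over-reverse autodiff,
    # avoiding materializing the full Hessian.
    return jax.jvp(jax.grad(loss_fn), (params,), (v,))[1]


def lambda_max(loss_fn, params, num_iters=100, seed=0):
    # Power iteration: repeatedly apply the Hessian to a unit vector and
    # track the Rayleigh quotient, which converges to the dominant eigenvalue.
    v = jax.random.normal(jax.random.PRNGKey(seed), params.shape)
    v = v / jnp.linalg.norm(v)
    eig = 0.0
    for _ in range(num_iters):
        Hv = hvp(loss_fn, params, v)
        eig = jnp.vdot(v, Hv)          # Rayleigh quotient with ||v|| = 1
        v = Hv / (jnp.linalg.norm(Hv) + 1e-12)
    return eig


# Toy check: quadratic loss 0.5 * w^T A w, whose Hessian is A with
# eigenvalues {5.0, 2.0, 0.5}, so the estimate should approach 5.0.
A = jnp.diag(jnp.array([5.0, 2.0, 0.5]))
loss = lambda w: 0.5 * w @ A @ w
w0 = jnp.ones(3)
print(lambda_max(loss, w0))
```

For a real network, `params` would be a flattened parameter vector (or the loop adapted to pytrees), and the loss would be evaluated on a fixed batch of data.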

