Web: http://arxiv.org/abs/2201.12250

Jan. 31, 2022, 2:11 a.m. | Frederik Benzing

cs.LG updates on arXiv.org

Second-order optimizers are thought to hold the potential to speed up neural
network training, but due to the enormous size of the curvature matrix, they
typically require approximations to be computationally tractable. The most
successful family of approximations consists of Kronecker-factored,
block-diagonal curvature estimates (KFAC). Here, we combine tools from prior
work to evaluate exact second-order updates with careful ablations and
establish a surprising result: due to its approximations, KFAC is not closely
related to second-order updates, and in particular, it …
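
To make the Kronecker-factored idea named in the abstract concrete, here is a minimal NumPy sketch of a KFAC-style preconditioned step for a single linear layer. It assumes the standard factorization of the layer's Fisher block into an input-activation covariance A and a pre-activation-gradient covariance G; the function name, shapes, and damping value are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def kfac_update(grad_W, acts, grad_out, damping=1e-3):
    """One KFAC-style preconditioned step for a single linear layer (sketch).

    The layer's Fisher block is approximated as the Kronecker product
    A (x) G, where
      A = E[a a^T]  -- covariance of the layer's inputs,
      G = E[g g^T]  -- covariance of the pre-activation gradients,
    so the preconditioned gradient is G^{-1} grad_W A^{-1}.
    Illustrative shapes: grad_W (out, in), acts (batch, in),
    grad_out (batch, out).
    """
    batch = acts.shape[0]
    A = acts.T @ acts / batch          # (in, in) input covariance
    G = grad_out.T @ grad_out / batch  # (out, out) gradient covariance
    # Tikhonov damping keeps both factors well-conditioned and invertible.
    A += damping * np.eye(A.shape[0])
    G += damping * np.eye(G.shape[0])
    # Kronecker identity: (A (x) G)^{-1} vec(grad_W) = vec(G^{-1} grad_W A^{-1}),
    # so the full curvature matrix is never materialized.
    return np.linalg.solve(G, grad_W) @ np.linalg.inv(A)

# Illustrative call with random data:
rng = np.random.default_rng(0)
step = kfac_update(rng.standard_normal((4, 8)),   # gradient of a 4x8 weight
                   rng.standard_normal((32, 8)),  # batch of layer inputs
                   rng.standard_normal((32, 4)))  # batch of output gradients
```

The Kronecker structure is what makes this tractable: inverting the two small factors costs O(in^3 + out^3) per layer, versus O((in * out)^3) for inverting the exact layer-wise curvature block.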

Tags: arxiv, gradient, neurons, optimization
