May 25, 2022, 1:10 a.m. | Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

cs.LG updates on arXiv.org

In this work, we propose using a quadratic model as a tool for understanding
properties of wide neural networks in both optimization and generalization. We
show analytically that certain deep learning phenomena such as the "catapult
phase" from [Lewkowycz et al. 2020], which cannot be captured by linear models,
are manifested in the quadratic model for shallow ReLU networks. Furthermore,
our empirical results indicate that the behaviour of quadratic models parallels
that of neural networks in generalization, especially in the …

arxiv dynamics network neural network understanding
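
To make the contrast with linear models concrete, here is a minimal sketch, not the paper's construction: gradient descent on the classic two-parameter model f(u, v) = u*v with a single target of 0 and squared loss, a toy quadratic-in-parameters case in the spirit of the examples in [Lewkowycz et al. 2020]. The initial values and learning rates below are illustrative assumptions.

# Illustrative sketch only (not the paper's exact quadratic model): fit
# f(u, v) = u * v to a single example with target 0 under squared loss
# L = f**2 / 2.  The model is quadratic in (u, v), so gradient descent with a
# large step size can show catapult-like dynamics: the loss first rises, then
# drops sharply, while the tangent kernel u**2 + v**2 shrinks.  At the larger
# step size below, a linearized model would instead diverge.

def run_gd(lr, steps=60, u=2.0, v=0.5):
    losses, kernels = [], []
    for _ in range(steps):
        f = u * v
        losses.append(0.5 * f ** 2)
        kernels.append(u ** 2 + v ** 2)   # tangent kernel of this toy model
        gu, gv = f * v, f * u             # gradients of L w.r.t. u and v
        u, v = u - lr * gu, v - lr * gv
    return losses, kernels

for lr in (0.1, 0.8):  # sub-critical vs. catapult regime for this initialization
    losses, kernels = run_gd(lr)
    head = [round(x, 4) for x in losses[:4]]
    print(f"lr={lr}: first losses {head}, final loss {losses[-1]:.1e}, "
          f"kernel {kernels[0]:.2f} -> {kernels[-1]:.2f}")

At the smaller step size the loss decays monotonically; at the larger one it spikes on the first step, then collapses toward zero while the kernel ends up well below its initial value, which is the qualitative catapult signature that a linearized model cannot reproduce.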
