June 2, 2022, 1:11 a.m. | Jiahui Yu, Konstantinos Spiliopoulos

stat.ML updates on arXiv.org

We consider shallow (single hidden layer) neural networks and characterize
their performance when trained with stochastic gradient descent as the number
of hidden units $N$ and gradient descent steps grow to infinity. In particular,
we investigate the effect of different scaling schemes, which lead to different
normalizations of the neural network, on the network's statistical output,
closing the gap between the $1/\sqrt{N}$ and the mean-field $1/N$
normalization. We develop an asymptotic expansion for the neural network's
statistical output pointwise with …
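A minimal sketch of the setup being described, assuming the standard parameterization used in this line of work (the function name `shallow_net_output`, the tanh activation, and the Gaussian initialization are illustrative choices, not taken from the paper): the single-hidden-layer network's output is divided by $N^\gamma$, where $\gamma = 1/2$ recovers the $1/\sqrt{N}$ normalization and $\gamma = 1$ the mean-field $1/N$ normalization, with the abstract studying the range in between.

```python
import numpy as np

# Hypothetical illustration (not the paper's code): a shallow network whose
# output is scaled by 1/N^gamma. gamma = 0.5 gives the 1/sqrt(N) regime,
# gamma = 1.0 the mean-field 1/N regime.
def shallow_net_output(x, W, c, gamma):
    # W: (N, d) hidden-layer weights, c: (N,) output weights, x: (d,) input
    N = W.shape[0]
    hidden = np.tanh(W @ x)          # one hidden layer, smooth activation
    return (c @ hidden) / N**gamma   # normalization 1/N^gamma

rng = np.random.default_rng(0)
N, d = 1000, 5
W = rng.normal(size=(N, d))
c = rng.normal(size=N)
x = rng.normal(size=d)
for gamma in (0.5, 0.75, 1.0):
    print(gamma, shallow_net_output(x, W, c, gamma))
```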

arxiv effects ml networks neural networks normalization
