Web: http://arxiv.org/abs/2201.11968

Jan. 31, 2022, 2:11 a.m. | Thien Le, Stefanie Jegelka

cs.LG updates on arXiv.org

The implicit bias induced by the training of neural networks has become a
topic of rigorous study. In the limit of gradient flow and gradient descent
with appropriate step size, it has been shown that when one trains a deep
linear network with logistic or exponential loss on linearly separable data,
the weights converge to rank-$1$ matrices. In this paper, we extend this
theoretical result to the much wider class of nonlinear ReLU-activated
feedforward networks containing fully-connected layers and skip …
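Below is a minimal numerical sketch (not from the paper) of the low-rank phenomenon the abstract refers to for the deep *linear* case: a small fully-connected linear network trained with logistic loss on synthetic linearly separable data, while tracking how close each weight matrix is to rank 1 via the ratio sigma_1(W) / ||W||_F (which equals 1 exactly when W has rank 1). The layer widths, learning rate, and step counts are illustrative choices, not taken from the paper.

```python
# Hedged sketch: deep linear network + logistic loss on separable data;
# monitor sigma_1 / Frobenius norm of each layer (-> 1 as W becomes rank 1).
import torch

torch.manual_seed(0)

# Synthetic linearly separable data: labels are the sign of <w_star, x>.
n, d = 200, 10
w_star = torch.randn(d)
X = torch.randn(n, d)
y = (X @ w_star > 0).float()

# Deep linear network: three fully-connected layers, no activations, no bias.
widths = [d, 16, 16, 1]
layers = [torch.nn.Linear(widths[i], widths[i + 1], bias=False) for i in range(3)]
model = torch.nn.Sequential(*layers)

loss_fn = torch.nn.BCEWithLogitsLoss()
opt = torch.optim.SGD(model.parameters(), lr=0.1)

def rank1_ratio(W):
    """sigma_1 / ||W||_F; approaches 1 as W becomes effectively rank 1."""
    s = torch.linalg.svdvals(W)
    return (s[0] / s.norm()).item()

for step in range(20001):
    opt.zero_grad()
    loss = loss_fn(model(X).squeeze(-1), y)
    loss.backward()
    opt.step()
    if step % 5000 == 0:
        ratios = [f"{rank1_ratio(l.weight.detach()):.4f}" for l in layers]
        print(f"step {step:6d}  loss {loss.item():.4e}  sigma1/||W||_F per layer: {ratios}")
```

As the logistic loss is driven toward zero on separable data, the printed per-layer ratios should trend toward 1, illustrating the directional convergence to rank-1 matrices that the paper extends to ReLU networks with skip connections.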

Tags: arxiv, networks, training
