Feature Learning and Generalization in Deep Networks with Orthogonal Weights
June 13, 2024, 4:49 a.m. | Hannah Day, Yonatan Kahn, Daniel A. Roberts
stat.ML updates on arXiv.org
Abstract: Fully-connected deep neural networks with weights initialized from independent Gaussian distributions can be tuned to criticality, which prevents the exponential growth or decay of signals propagating through the network. However, such networks still exhibit fluctuations that grow linearly with the depth of the network, which may impair the training of networks with width comparable to depth. We show analytically that rectangular networks with tanh activations and weights initialized from the ensemble of orthogonal matrices have …
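The abstract contrasts independent Gaussian initialization tuned to criticality with initialization drawn from the ensemble of orthogonal matrices. As a minimal sketch of that setup (not the paper's code), the snippet below propagates a signal through a deep rectangular tanh network under both initializations and reports the average squared activation at the final layer; the width, depth, and the Gaussian scale sigma_w^2 = 1 (a common critical choice for tanh with zero bias) are illustrative assumptions.

```python
import numpy as np

def orthogonal_weights(n, rng):
    # Haar-random orthogonal matrix via QR decomposition of a Gaussian matrix.
    a = rng.standard_normal((n, n))
    q, r = np.linalg.qr(a)
    # Fix column signs so the result is uniformly distributed over O(n).
    q *= np.sign(np.diag(r))
    return q

def gaussian_weights(n, rng, sigma_w=1.0):
    # Independent Gaussian entries with variance sigma_w^2 / n
    # (sigma_w^2 = 1 is an assumed critical scale for tanh networks).
    return rng.standard_normal((n, n)) * (sigma_w / np.sqrt(n))

def propagate(x, depth, n, init, rng):
    # Pass the signal through `depth` tanh layers, tracking mean squared activation.
    norms = []
    for _ in range(depth):
        W = init(n, rng)
        x = np.tanh(W @ x)
        norms.append(np.mean(x ** 2))
    return norms

rng = np.random.default_rng(0)
n, depth = 256, 64                      # illustrative width and depth
x0 = rng.standard_normal(n)
print("gaussian  :", propagate(x0.copy(), depth, n, gaussian_weights, rng)[-1])
print("orthogonal:", propagate(x0.copy(), depth, n, orthogonal_weights, rng)[-1])
```

In this toy comparison both initializations keep the mean signal size roughly constant with depth; the paper's analytical results concern the finite-width fluctuations around that mean, which the sketch does not attempt to reproduce.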