March 5, 2024, 2:42 p.m. | Damien Teney, Armand Nicolicioiu, Valentin Hartmann, Ehsan Abbasnejad

cs.LG updates on arXiv.org

arXiv:2403.02241v1 Announce Type: new
Abstract: Our understanding of the generalization capabilities of neural networks (NNs) is still incomplete. Prevailing explanations are based on implicit biases of gradient descent (GD), but they cannot account for the capabilities of models obtained with gradient-free methods, nor for the simplicity bias recently observed in untrained networks. This paper seeks other sources of generalization in NNs.
Findings. To understand the inductive biases provided by architectures independently of GD, we examine untrained, random-weight networks. Even simple MLPs show …
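The abstract's methodology, probing the functions implemented by untrained, random-weight networks, is easy to prototype. Below is a minimal sketch of that kind of experiment: sample random MLPs and score the complexity of the functions they compute. The 1/sqrt(fan_in) weight scale and the Fourier-based complexity proxy are assumptions for illustration, not the paper's actual setup.

```python
import numpy as np

def random_mlp(widths, activation=np.tanh, seed=0):
    """Build an untrained MLP with Gaussian random weights (no training step).

    `widths` lists layer sizes, e.g. [1, 64, 64, 1]. The 1/sqrt(fan_in)
    weight scale is a standard initialization chosen here as an assumption,
    not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    params = [
        (rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out)),
         rng.normal(0.0, 0.1, size=n_out))
        for n_in, n_out in zip(widths[:-1], widths[1:])
    ]

    def f(x):
        h = x
        for i, (W, b) in enumerate(params):
            h = h @ W + b
            if i < len(params) - 1:  # keep the output layer linear
                h = activation(h)
        return h

    return f

def mean_frequency(f, n=1024):
    """Crude complexity proxy: power-weighted mean Fourier frequency of f on [-1, 1].

    Lower values indicate a bias toward low-frequency (simple) functions.
    This specific measure is a hypothetical stand-in for whatever complexity
    metric the paper uses.
    """
    x = np.linspace(-1.0, 1.0, n).reshape(-1, 1)
    y = f(x).ravel()
    power = np.abs(np.fft.rfft(y - y.mean())) ** 2
    freqs = np.fft.rfftfreq(n, d=2.0 / n)  # cycles per unit of input
    return float((freqs * power).sum() / power.sum())

# Average the proxy over many random draws. If untrained tanh MLPs mostly
# implement low-frequency functions, that is a simplicity bias present
# before any gradient descent is run.
scores = [mean_frequency(random_mlp([1, 64, 64, 1], seed=s)) for s in range(50)]
print(f"mean dominant frequency over random nets: {np.mean(scores):.3f}")
```

Swapping the activation (e.g. np.sin or a ReLU) or the weight scale changes the score distribution, which is the sense in which the architecture itself, independent of GD, carries an inductive bias.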

