Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Feb. 21, 2024, 5:41 a.m. | Akash Guna R. T, Arnav Chavan, Deepak Gupta
cs.LG updates on arXiv.org
Abstract: Conventional neural network scaling typically starts from a base network and grows dimensions such as width and depth by predefined scaling factors. We introduce an automated scaling approach that leverages second-order loss landscape information. Our method is flexible towards skip connections, a mainstay of modern vision transformers, and it jointly scales and trains transformers without additional training iterations. Motivated by the hypothesis that not all neurons need uniform depth complexity, …
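The excerpt does not detail the paper's procedure, but the core idea it names, using second-order loss landscape information to decide where a network needs more capacity, can be illustrated with a minimal sketch. The snippet below estimates a per-layer curvature score for a toy two-layer linear model via finite-difference diagonal Hessians and picks the sharpest layer as the candidate for depth growth. Every name and the scoring rule here are assumptions for illustration, not the authors' method:

```python
import numpy as np

def layer_curvature_scores(loss_fn, params, eps=1e-4):
    """Score each layer by the summed finite-difference second derivative
    of the loss with respect to that layer's weights (diagonal Hessian)."""
    scores = []
    for w in params:
        flat = w.ravel()  # view: in-place edits perturb the real weights
        total = 0.0
        for i in range(flat.size):
            orig = flat[i]
            flat[i] = orig + eps
            lp = loss_fn(params)
            flat[i] = orig - eps
            lm = loss_fn(params)
            flat[i] = orig
            l0 = loss_fn(params)
            # central second difference: d^2 L / dw_i^2
            total += (lp - 2.0 * l0 + lm) / eps**2
        scores.append(total)
    return scores

# Toy two-layer linear model on fixed random data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.normal(size=(8, 1))
params = [rng.normal(size=(3, 4)), rng.normal(size=(4, 1))]

def loss_fn(ps):
    pred = X @ ps[0] @ ps[1]
    return float(np.mean((pred - y) ** 2))

scores = layer_curvature_scores(loss_fn, params)
grow_at = int(np.argmax(scores))  # layer whose loss landscape is sharpest
print(grow_at, [round(s, 2) for s in scores])
```

In a real scaling loop one would use efficient Hessian approximations (e.g. Hessian-vector products) rather than finite differences, and grow depth at the highest-scoring location between training phases.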