Peeking Behind the Curtains of Residual Learning

Feb. 14, 2024, 5:43 a.m. | Tunhou Zhang, Feng Yan, Hai Li, Yiran Chen

cs.LG updates on arXiv.org

The utilization of residual learning has become widespread in deep and scalable neural nets. However, the fundamental principles that contribute to the success of residual learning remain elusive, thus hindering effective training of plain nets with depth scalability. In this paper, we peek behind the curtains of residual learning by uncovering the "dissipating inputs" phenomenon that leads to convergence failure in plain neural nets: the input is gradually compromised through plain layers due to non-linearities, resulting in challenges of learning …
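
The abstract's point that non-linearities in plain layers compromise the input can be illustrated with a toy sketch. The code below is not the paper's method or experimental setup: the function names, the identity weight matrices, and the two example inputs are all assumptions chosen for illustration. It only shows that a stack of plain ReLU layers can map two different inputs to the same output, while the same layers with an identity skip connection keep them distinguishable.

```python
import numpy as np


def relu(z):
    """Element-wise ReLU non-linearity."""
    return np.maximum(z, 0.0)


def plain_forward(x, weights):
    """Plain stack: each layer overwrites the signal with relu(W @ h)."""
    h = x
    for W in weights:
        h = relu(W @ h)
    return h


def residual_forward(x, weights):
    """Residual stack: each layer adds relu(W @ h) to an identity skip path."""
    h = x
    for W in weights:
        h = h + relu(W @ h)
    return h


# Toy setup (an assumption for illustration): four layers with identity weights.
dim, depth = 4, 4
weights = [np.eye(dim) for _ in range(depth)]

# Two inputs that share their positive entries but differ in the negative ones.
x_a = np.array([1.0, -2.0, 3.0, -4.0])
x_b = np.array([1.0, -5.0, 3.0, -0.5])

plain_a, plain_b = plain_forward(x_a, weights), plain_forward(x_b, weights)
res_a, res_b = residual_forward(x_a, weights), residual_forward(x_b, weights)

# The plain stack discards the negative entries, so both inputs collapse together.
print("plain outputs identical:   ", np.array_equal(plain_a, plain_b))
# The skip path carries the original entries forward, so the inputs stay distinct.
print("residual outputs identical:", np.array_equal(res_a, res_b))
```

In this degenerate setting the first ReLU already erases the difference between the two inputs, and no later plain layer can recover it. Trained networks with random or learned weights behave less starkly, so the sketch should be read as intuition for the phenomenon rather than as evidence for the paper's claims.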
