Feb. 13, 2024, 5:41 a.m. | Bradley T. Baker, Barak A. Pearlmutter, Robyn Miller, Vince D. Calhoun, Sergey M. Plis

cs.LG updates on arXiv.org

Our understanding of the learning dynamics of deep neural networks (DNNs) remains incomplete. Recent research has begun to uncover the mathematical principles underlying these networks, including the phenomenon of "Neural Collapse", where linear classifiers within DNNs converge to specific geometric structures during late-stage training. However, the role of geometric constraints in learning extends beyond this terminal phase. For instance, gradients in fully-connected layers naturally develop a low-rank structure as rank-one outer products accumulate over a training batch. Despite …
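To make the low-rank claim concrete, here is a minimal NumPy sketch (not from the paper; the layer sizes and batch size are hypothetical). For a fully-connected layer y = Wx, the batch gradient of W is the sum of rank-one outer products delta_i x_i^T over the batch, so its rank is bounded by the batch size B:

```python
import numpy as np

# Sketch only: demonstrates that a fully-connected layer's batch gradient
# dL/dW = sum_i delta_i x_i^T has rank at most the batch size B.
rng = np.random.default_rng(0)
d_out, d_in, B = 64, 128, 8              # hypothetical layer sizes and batch size

X = rng.standard_normal((B, d_in))       # batch of layer inputs x_i
Delta = rng.standard_normal((B, d_out))  # upstream gradients delta_i = dL/dy_i

# Accumulate the rank-one outer products delta_i x_i^T over the batch.
grad_W = sum(np.outer(Delta[i], X[i]) for i in range(B))

# Equivalent vectorized form: Delta^T X.
assert np.allclose(grad_W, Delta.T @ X)

# The gradient's rank is bounded by B, even though W is d_out x d_in.
rank = np.linalg.matrix_rank(grad_W)
print(f"gradient shape {grad_W.shape}, rank = {rank} <= batch size {B}")
```

With these settings the gradient is a 64 x 128 matrix, yet its rank is at most 8: the low-rank structure arises purely from how the batch gradient is accumulated, independent of the network's weights.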
