Web: http://arxiv.org/abs/2112.00029

May 12, 2022, 1:11 a.m. | Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao Song, Atri Rudra, Christopher Ré

cs.LG updates on arXiv.org

Overparameterized neural networks generalize well but are expensive to train.
Ideally, one would like to reduce their computational cost while retaining
their generalization benefits. Sparse model training is a simple and promising
approach to achieving this, but existing methods struggle with accuracy loss,
slow training runtime, or difficulty sparsifying all model components. The
core problem is that searching for a sparsity mask over a discrete set of
sparse matrices is difficult and expensive. To address …
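To see why the discrete search is the bottleneck, consider a minimal NumPy sketch (an illustration, not the paper's method): a sparsity mask is just a binary matrix applied elementwise to the weights, and the number of candidate masks at a fixed density is combinatorial in the matrix size.

```python
import numpy as np
from math import comb

def random_mask(shape, density, rng):
    # One candidate sparsity mask: keep a random `density` fraction of entries.
    return (rng.random(shape) < density).astype(np.float32)

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8)).astype(np.float32)

# Size of the discrete search space: the number of masks with exactly k
# nonzeros out of n entries is C(n, k) -- astronomically large even for
# a tiny 8x8 weight matrix at 25% density.
n = W.size
k = n // 4
num_masks = comb(n, k)

# Training with one chosen mask means multiplying it into the weights,
# so only the unmasked entries contribute (and receive gradients).
mask = random_mask(W.shape, 0.25, rng)
W_sparse = W * mask
```

Structured sparsity families (such as the butterfly-based patterns this line of work studies) sidestep the search by fixing the mask pattern up front instead of optimizing over this discrete set.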

