Web: http://arxiv.org/abs/2206.11357

June 24, 2022, 1:10 a.m. | Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen, Zhiyuan Liu, Jie Tang, Joey Gonzalez, Michael Mahoney, Alvin Cheung

cs.LG updates on arXiv.org (arxiv.org)

Training large neural network (NN) models requires extensive memory
resources, and Activation Compressed Training (ACT) is a promising approach to
reduce training memory footprint. This paper presents GACT, an ACT framework to
support a broad range of machine learning tasks for generic NN architectures
with limited domain knowledge. By analyzing a linearized version of ACT's
approximate gradient, we prove the convergence of GACT without prior knowledge
of operator type or model architecture. To make training stable, we propose an
algorithm …
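
The core idea behind ACT is to compress activations in the forward pass and decompress them only when the backward pass needs them. The sketch below illustrates that idea with a toy PyTorch autograd function; CompressedLinear is a hypothetical example written for this note, not GACT's actual API. It stores the input activation as int8 with a per-tensor scale (roughly a 4x memory reduction over keeping it in fp32) and dequantizes it in the backward pass to form an approximate weight gradient.

```python
import torch


class CompressedLinear(torch.autograd.Function):
    """Toy activation-compressed linear layer (illustrative, not GACT's API).

    The input activation is quantized to int8 before being saved for the
    backward pass, then dequantized when the weight gradient is computed.
    """

    @staticmethod
    def forward(ctx, x, weight):
        # Quantize the activation with a single per-tensor scale.
        scale = x.abs().max().clamp(min=1e-8) / 127.0
        x_q = torch.clamp((x / scale).round(), -127, 127).to(torch.int8)
        ctx.save_for_backward(x_q, weight)
        ctx.scale = scale
        return x @ weight.t()

    @staticmethod
    def backward(ctx, grad_out):
        x_q, weight = ctx.saved_tensors
        # Dequantize the saved activation to approximate the true input.
        x_hat = x_q.to(grad_out.dtype) * ctx.scale
        grad_x = grad_out @ weight        # exact gradient w.r.t. the input
        grad_w = grad_out.t() @ x_hat     # approximate gradient w.r.t. the weight
        return grad_x, grad_w


# Usage sketch: the saved activation occupies roughly 1/4 of its fp32 size.
x = torch.randn(32, 256, requires_grad=True)
w = torch.randn(128, 256, requires_grad=True)
y = CompressedLinear.apply(x, w)
y.sum().backward()
```

A fixed 8-bit quantization stands in here for whatever per-tensor compression an ACT framework would actually apply; the compressed activation only perturbs the weight gradient, which is the kind of approximate gradient the paper's convergence analysis reasons about.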

Tags: arxiv, general, lg, training
