Forward Gradient-Based Frank-Wolfe Optimization for Memory Efficient Deep Neural Network Training
March 20, 2024, 4:41 a.m. | M. Rostami, S. S. Kia
cs.LG updates on arXiv.org
Abstract: Training a deep neural network with gradient-based methods requires computing gradients at each layer. However, using backpropagation (reverse-mode differentiation) to calculate these gradients incurs significant memory consumption, making backpropagation an inefficient way to compute them. This paper analyzes the performance of the well-known Frank-Wolfe algorithm, a.k.a. the conditional gradient algorithm, when it has access only to the forward mode of automatic differentiation to compute gradients. We provide in-depth technical details that show …
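The core idea can be sketched as follows. A single forward-mode pass (a Jacobian-vector product) along a random direction v yields an unbiased estimate of the gradient without storing intermediate activations, and that estimate drives a standard Frank-Wolfe update over a constraint set. This is a minimal illustration, not the paper's implementation: the quadratic objective, the ℓ1-ball constraint, and the 2/(t+2) step size are assumptions chosen for simplicity.

```python
import jax
import jax.numpy as jnp

def loss(w):
    # Simple quadratic objective as a stand-in for a DNN training loss.
    return jnp.sum((w - 1.0) ** 2)

def forward_gradient(f, w, key):
    # One forward-mode pass gives d = <grad f(w), v>; g = d * v is an
    # unbiased gradient estimate when v ~ N(0, I). No reverse pass,
    # so no activation storage is needed.
    v = jax.random.normal(key, w.shape)
    _, d = jax.jvp(f, (w,), (v,))
    return d * v

def lmo_l1(g, tau):
    # Linear minimization oracle over the l1 ball of radius tau:
    # a signed vertex along the coordinate with largest |gradient|.
    i = jnp.argmax(jnp.abs(g))
    return -tau * jnp.sign(g[i]) * jax.nn.one_hot(i, g.shape[0])

def frank_wolfe(f, w0, tau=5.0, steps=200, seed=0):
    w, key = w0, jax.random.PRNGKey(seed)
    for t in range(steps):
        key, sub = jax.random.split(key)
        g = forward_gradient(f, w, sub)   # memory-cheap gradient estimate
        s = lmo_l1(g, tau)                # conditional gradient direction
        gamma = 2.0 / (t + 2.0)           # classic Frank-Wolfe step size
        w = w + gamma * (s - w)           # convex-combination update stays feasible
    return w
```

Because each iterate is a convex combination of feasible points, no projection step is needed, which is the usual appeal of Frank-Wolfe over projected gradient methods.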