June 17, 2022, 1:11 a.m. | Romain Cosson, Ali Jadbabaie, Anuran Makur, Amirhossein Reisizadeh, Devavrat Shah

Several recent empirical studies demonstrate that important machine learning
tasks, e.g., training deep neural networks, exhibit low-rank structure, where
the loss function varies significantly in only a few directions of the input
space. In this paper, we leverage such low-rank structure to reduce the high
computational cost of canonical gradient-based methods such as gradient descent
(GD). Our proposed \emph{Low-Rank Gradient Descent} (LRGD) algorithm finds an
$\epsilon$-approximate stationary point of a $p$-dimensional function by first
identifying $r \leq p$ significant directions, …
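
The listing cuts the abstract off after the direction-identification step, so the sketch below is only one plausible reading of the idea, not the authors' algorithm. It assumes the $r$ significant directions are estimated from the top singular vectors of gradients sampled at random points, and that each later iteration uses only $r$ directional derivatives along those directions instead of a full $p$-dimensional gradient. The helper names (`estimate_directions`, `directional_derivatives`, `lrgd`), the sampling scheme, the finite-difference step `h`, and the test function are all hypothetical.

```python
import numpy as np


def estimate_directions(grad, p, r, n_samples=None, rng=None):
    """Assumed heuristic: sample gradients at random points and keep the
    top-r left singular vectors as the 'significant' directions."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_samples = 2 * r if n_samples is None else n_samples
    # Columns of G are full gradients evaluated at random points (p x n_samples).
    G = np.column_stack([grad(rng.standard_normal(p)) for _ in range(n_samples)])
    U, _, _ = np.linalg.svd(G, full_matrices=False)
    return U[:, :r]  # p x r matrix with orthonormal columns


def directional_derivatives(f, x, U, h=1e-5):
    """Central finite differences along each column of U (2r evaluations of f)."""
    return np.array([(f(x + h * u) - f(x - h * u)) / (2 * h) for u in U.T])


def lrgd(f, grad, x0, r, step, eps, max_iter=10_000):
    """Gradient descent restricted to the estimated r-dimensional subspace."""
    U = estimate_directions(grad, x0.size, r)
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        # Low-rank surrogate for the gradient: sum of r directional terms.
        g_low = U @ directional_derivatives(f, x, U)
        if np.linalg.norm(g_low) <= eps:  # stop once the surrogate is eps-small
            break
        x -= step * g_low
    return x, U


# Hypothetical test problem: f depends on x only through A.T @ x, so every
# gradient lies in the r-dimensional column space of A (exact rank r).
rng = np.random.default_rng(1)
p, r = 50, 2
A = rng.standard_normal((p, r))
b = rng.standard_normal(r)
f = lambda x: 0.5 * np.sum((A.T @ x - b) ** 2)
grad = lambda x: A @ (A.T @ x - b)

step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L, L = largest eigenvalue of A A^T
x_hat, U = lrgd(f, grad, rng.standard_normal(p), r, step, eps=1e-6)
print("true gradient norm at x_hat:", np.linalg.norm(grad(x_hat)))
```

In this toy setting the function is exactly rank $r$, so the surrogate gradient coincides with the true gradient up to finite-difference error. For a function that is only approximately low-rank, the surrogate would drop the components outside the estimated subspace, and the stopping test would no longer certify stationarity of the full gradient.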

