Feb. 15, 2024, 5:42 a.m. | Akshay Kumar, Jarvis Haupt

cs.LG updates on arXiv.org

arXiv:2402.09226v1 Announce Type: new
Abstract: This paper examines gradient flow dynamics of two-homogeneous neural networks for small initializations, where all weights are initialized near the origin. For both square and logistic losses, it is shown that for sufficiently small initializations, the gradient flow dynamics spend sufficient time in the neighborhood of the origin to allow the weights of the neural network to approximately converge in direction to the Karush-Kuhn-Tucker (KKT) points of a neural correlation function that quantifies the correlation …
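
To make the term "two-homogeneous" concrete, the sketch below numerically checks the defining property f(x; c·w) = c²·f(x; w) for c > 0. It assumes a bias-free two-layer ReLU network, which is a standard example of a two-homogeneous model but is not necessarily the architecture the paper studies. The names `two_layer_relu` and `scale` are hypothetical and chosen for illustration only.

```python
import random

def two_layer_relu(x, W, v):
    """Assumed example model: f(x; W, v) = sum_j v[j] * relu(<W[j], x>), no biases."""
    return sum(
        vj * max(0.0, sum(wi * xi for wi, xi in zip(wj, x)))
        for wj, vj in zip(W, v)
    )

def scale(W, v, c):
    """Multiply every weight in both layers by the same scalar c."""
    return [[c * wi for wi in wj] for wj in W], [c * vj for vj in v]

rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(3)]                        # one input vector
W = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(4)]    # hidden-layer weights
v = [rng.gauss(0, 1) for _ in range(4)]                        # output-layer weights

c = 0.01  # a small positive scale, mimicking an initialization near the origin
W_s, v_s = scale(W, v, c)

base = two_layer_relu(x, W, v)
scaled = two_layer_relu(x, W_s, v_s)

# Two-homogeneity: scaling all weights by c scales the output by c**2.
homogeneity_gap = abs(scaled - c**2 * base)
print(homogeneity_gap)
```

Because the output shrinks quadratically with the weight scale, a network initialized near the origin produces outputs that are very small relative to the weights themselves, which is the regime the abstract describes.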


