June 10, 2024, 4:45 a.m. | Si Yi Meng, Antonio Orvieto, Daniel Yiming Cao, Christopher De Sa

cs.LG updates on arXiv.org

arXiv:2406.05033v1 Announce Type: new
Abstract: We study gradient descent (GD) dynamics on logistic regression problems with large, constant step sizes. For linearly separable data, it is known that GD converges to the minimizer with arbitrarily large step sizes, a property that no longer holds when the problem is not separable. In fact, the behaviour can be much more complex: a sequence of period-doubling bifurcations begins at the critical step size $2/\lambda$, where $\lambda$ is the largest eigenvalue of the Hessian …
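
The critical step size mentioned in the abstract can be illustrated with a minimal sketch on a toy problem. Everything below is an assumption for illustration and is not taken from the paper: the one-dimensional, non-separable dataset (two copies of the feature x = 1 with opposite labels), the function names, the starting point, and the choice to evaluate the Hessian at the minimizer (the abstract is truncated before it says where the Hessian is evaluated). For this toy loss, L(w) = 0.5 * [log(1 + exp(-w)) + log(1 + exp(w))], the minimizer is w = 0 and the second derivative there is 0.25, so the threshold 2/λ would be 8.

```python
import math

# Toy, non-separable 1-D logistic regression (hypothetical example):
# two points with feature x = 1 and labels +1 and -1.
# Loss: L(w) = 0.5 * [log(1 + exp(-w)) + log(1 + exp(w))], minimized at w = 0.

def grad(w):
    """Gradient of the toy loss: sigmoid(w) - 0.5 = 0.5 * tanh(w / 2)."""
    return 0.5 * math.tanh(w / 2.0)

def run_gd(step_size, w0=0.5, num_steps=500):
    """Run constant-step-size gradient descent and return all iterates."""
    w = w0
    iterates = [w]
    for _ in range(num_steps):
        w = w - step_size * grad(w)
        iterates.append(w)
    return iterates

# Second derivative of the toy loss at its minimizer w = 0 (assumed to play
# the role of lambda here): sigmoid'(0) = 0.25.
HESSIAN_AT_MIN = 0.25
CRITICAL_STEP = 2.0 / HESSIAN_AT_MIN  # equals 8.0 for this toy problem

below = run_gd(step_size=4.0)   # below the critical step size
above = run_gd(step_size=10.0)  # slightly above the critical step size

print("critical step size:", CRITICAL_STEP)
print("last iterates, step 4.0 :", below[-2:])
print("last iterates, step 10.0:", above[-2:])
```

In this sketch, the run below the threshold settles at the minimizer, while the run just above it ends up alternating between two points of opposite sign, which is the kind of period-2 behaviour that a first period-doubling bifurcation would produce. Larger step sizes and higher-dimensional data are not covered by this toy example.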

