Convergence of Gradient Descent for Recurrent Neural Networks: A Nonasymptotic Analysis
Feb. 20, 2024, 5:42 a.m. | Semih Cayci, Atilla Eryilmaz
cs.LG updates on arXiv.org (arxiv.org)
Abstract: We analyze recurrent neural networks trained with gradient descent in the supervised learning setting for dynamical systems, and prove that gradient descent can achieve optimality \emph{without} massive overparameterization. Our in-depth nonasymptotic analysis (i) provides sharp bounds on the network size $m$ and iteration complexity $\tau$ in terms of the sequence length $T$, sample size $n$ and ambient dimension $d$, and (ii) identifies the significant impact of long-term dependencies in the dynamical system on the convergence …
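To make the setting concrete, here is a minimal sketch of the kind of training procedure the abstract describes: an Elman-style recurrent network trained with plain full-batch gradient descent on a supervised sequence-regression task. Everything here is illustrative, not taken from the paper: the data-generating process, the final-hidden-state readout, and the sizes n, T, d, m and step size are arbitrary choices for demonstration.

```python
import torch

torch.manual_seed(0)

# Illustrative problem sizes (not the regimes analyzed in the paper):
# n = sample size, T = sequence length, d = ambient input dimension, m = network width
n, T, d, m = 64, 10, 4, 32

# Synthetic supervised data: each target is a smooth functional of its input sequence
X = torch.randn(n, T, d)
w_true = torch.randn(d)
y = torch.tanh(X @ w_true).mean(dim=1)           # shape (n,)

# Elman RNN with tanh activations, read out at the final hidden state h_T
rnn = torch.nn.RNN(input_size=d, hidden_size=m, nonlinearity="tanh", batch_first=True)
readout = torch.nn.Linear(m, 1, bias=False)
params = list(rnn.parameters()) + list(readout.parameters())

eta = 0.05                                       # constant step size (illustrative)
for step in range(500):
    out, _ = rnn(X)                              # out: (n, T, m), all hidden states
    pred = readout(out[:, -1, :]).squeeze(-1)    # prediction from h_T
    loss = 0.5 * ((pred - y) ** 2).mean()        # empirical squared loss

    for p in params:
        p.grad = None
    loss.backward()                              # backpropagation through time
    with torch.no_grad():                        # one full-batch gradient descent step
        for p in params:
            p -= eta * p.grad

    if step % 100 == 0:
        print(f"step {step:4d}  loss {loss.item():.4f}")
```

The paper's contribution concerns how large the width m and the number of such gradient steps must be, as functions of T, n and d, for this kind of procedure to reach optimality; the sketch above only fixes the basic training loop being analyzed.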