A Theoretical and Empirical Study on the Convergence of Adam with an "Exact" Constant Step Size in Non-Convex Settings
April 5, 2024, 4:42 a.m. | Alokendu Mazumder, Rishabh Sabharwal, Manan Tayal, Bhartendu Kumar, Punit Rathore
cs.LG updates on arXiv.org
Abstract: In neural network training, RMSProp and Adam remain widely favoured optimisation algorithms. A key to their performance lies in selecting the correct step size, which can significantly influence their effectiveness. Additionally, questions about their theoretical convergence properties continue to be a subject of interest. In this paper, we theoretically analyse a constant step size version of Adam in the non-convex setting and discuss why it is important for the convergence of Adam to …
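The excerpt does not include the paper's derived step size, but the constant-step-size setting it analyses corresponds to running Adam's standard update with one fixed α for every iteration, rather than a decaying schedule. The sketch below illustrates that setting in NumPy; the step size alpha=1e-2, the toy non-convex objective f(x) = x⁴ − 3x², and the helper name adam_step are illustrative assumptions, not the paper's "exact" step size or its code.

```python
import numpy as np

def adam_step(theta, grad, m, v, t,
              alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; alpha is held constant for all t (placeholder value,
    not the step size derived in the paper)."""
    # Exponential moving averages of the gradient and squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    # Bias-corrected moment estimates (t starts at 1).
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    # Parameter update with the fixed step size alpha.
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimise the non-convex scalar function f(x) = x**4 - 3*x**2,
# whose stationary points are x = 0 and x = ±sqrt(1.5).
x, m, v = 2.0, 0.0, 0.0
for t in range(1, 5001):
    g = 4 * x**3 - 6 * x          # f'(x)
    x, m, v = adam_step(x, g, m, v, t, alpha=1e-2)
print(x)  # settles near the stationary point at +sqrt(1.5) ≈ 1.2247
```

The toy loop only illustrates the fixed-α regime the abstract refers to: whether and why such a constant step size guarantees convergence in non-convex problems is the question the paper studies.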