Feb. 9, 2024, 5:43 a.m. | Yichuan Deng, Hang Hu, Zhao Song, Omri Weinstein, Danyang Zhuo

cs.LG updates on arXiv.org (arxiv.org)

The success of deep learning comes at a tremendous computational and energy cost, and the scalability of training massively overparametrized neural networks is becoming a real barrier to the progress of artificial intelligence (AI). Despite the popularity and low cost per iteration of traditional backpropagation via gradient descent, stochastic gradient descent (SGD) has a prohibitively slow convergence rate in non-convex settings, both in theory and in practice. To mitigate this cost, recent works have proposed employing alternative (Newton-type) training methods with much faster convergence …
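To make the trade-off the abstract describes concrete, here is a minimal sketch (not the authors' algorithm, and on a toy least-squares objective rather than a neural network): a gradient-descent step costs O(nd) per iteration but needs many iterations, while a Newton-type step forms and solves a d × d system (O(nd² + d³)) and converges in far fewer steps.

```python
# Illustrative sketch only: one gradient-descent trajectory vs. one Newton step
# on the toy objective f(w) = 0.5 * ||X w - y||^2 (not the paper's method).
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

def gradient(w):
    # Cheap per iteration: O(nd)
    return X.T @ (X @ w - y)

def hessian():
    # Expensive to form/solve: O(nd^2) + O(d^3), but constant for least squares
    return X.T @ X

# Gradient descent: many cheap steps with step size 1/L (L = Lipschitz constant).
w_gd = np.zeros(d)
lr = 1.0 / np.linalg.norm(X, 2) ** 2
for _ in range(100):
    w_gd -= lr * gradient(w_gd)

# Newton's method: a single costly step solves this quadratic problem exactly.
w_newton = np.zeros(d)
w_newton -= np.linalg.solve(hessian(), gradient(w_newton))

print("GD loss after 100 steps:", 0.5 * np.linalg.norm(X @ w_gd - y) ** 2)
print("Newton loss after 1 step:", 0.5 * np.linalg.norm(X @ w_newton - y) ** 2)
```

For overparametrized networks the Hessian-like matrix is huge, which is exactly why the higher cost per iteration of Newton-type methods becomes the bottleneck the paper targets.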

