Train ImageNet without Hyperparameters with Automatic Gradient Descent
April 21, 2023, 3:46 a.m. | Chris Mingard
Towards Data Science (Medium) | towardsdatascience.com
Towards architecture-aware optimisation
TL;DR We’ve derived an optimiser called automatic gradient descent (AGD) that can train ImageNet without hyperparameters. This removes the need for expensive and time-consuming hyperparameter tuning, such as choosing a learning rate or a learning rate decay schedule. Our paper can be found here.
I worked on this project with Jeremy Bernstein, Kevin Huang, Navid Azizan and Yisong Yue. See Jeremy’s GitHub for a clean PyTorch implementation, or my GitHub for an experimental version with more features. …
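Because AGD derives its step size from the network architecture and the loss itself, there are no learning rates or schedules to pass in. The sketch below shows what dropping such an optimiser into a standard PyTorch training loop might look like; the agd module, the AGD class name, and its constructor signature are assumptions for illustration, so check the linked repositories for the actual interface.

import torch
import torch.nn.functional as F
import torchvision

# Hypothetical import: the real module layout and class name may differ;
# see the linked GitHub repositories for the actual interface.
from agd import AGD

model = torchvision.models.resnet18(num_classes=1000)

# Assumed constructor: the optimiser is architecture-aware, so it plausibly
# takes the model itself rather than a bare parameter list. Note there is
# no learning rate, momentum, weight decay, or schedule to choose.
optimizer = AGD(model)

def train_step(images, labels):
    # One standard training step; only the optimiser construction changes.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()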