Dec. 7, 2023, 8:08 a.m. | Luis Medina

Towards Data Science (towardsdatascience.com)

The main optimization algorithms used for training neural networks, explained and implemented from scratch in Python

Photo by Jack Anstey on Unsplash

In my previous article about gradient descent, I explained the basic concepts behind it and summarized the main challenges of this kind of optimization.

However, I only covered Stochastic Gradient Descent (SGD) and the “batch” and “mini-batch” implementations of gradient descent.
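To recap the distinction, the three variants differ only in how many samples are used to estimate the gradient at each update. A minimal sketch on a least-squares loss (the function name, data, and parameters here are illustrative, not from the original articles):

```python
import numpy as np

def gradient_descent_step(w, X, y, lr=0.01, batch_size=None):
    """One update step of gradient descent on a mean-squared-error loss.

    batch_size=None       -> "batch" gradient descent (all samples)
    batch_size=1          -> stochastic gradient descent (SGD)
    1 < batch_size < N    -> mini-batch gradient descent
    """
    n = len(y)
    if batch_size is None:
        idx = np.arange(n)
    else:
        idx = np.random.choice(n, size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    # Gradient of (1/2m) * ||Xb @ w - yb||^2 with respect to w
    grad = Xb.T @ (Xb @ w - yb) / len(yb)
    return w - lr * grad

# Usage: recover y = 2x from noise-free data with mini-batches
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 2 * X[:, 0]
w = np.zeros(1)
for _ in range(500):
    w = gradient_descent_step(w, X, y, lr=0.1, batch_size=16)
# w approaches [2.0]
```

Smaller batches give cheaper, noisier updates; the full batch gives the exact gradient at the cost of touching every sample per step.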

Other algorithms offer advantages in terms of convergence speed, robustness to “landscape” features (the vanishing gradient …

Tags: data science, getting-started, machine learning, optimization-algorithms, programming
