May 17, 2022, 9:33 a.m. | /u/pseudo_random_here

Machine Learning www.reddit.com

So LogSoftmax has gained some popularity in the Deep Learning community for the computational-performance and gradient-optimization benefits it offers over Softmax. Since I have always used it as a final-layer activation function, this never occurred to me: using LogSoftmax repeatedly as a *hidden*-layer activation would pretty much serve no purpose (because of how backpropagation handles it).
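For context, a minimal sketch in plain Python (helper names are mine, not any library's API) of why LogSoftmax is typically preferred over taking `log(softmax(x))` in two steps — the log and the normalizing sum collapse into a single subtraction:

```python
import math

def softmax(x):
    # Naive softmax: exponentiate, then normalize.
    e = [math.exp(v) for v in x]
    s = sum(e)
    return [v / s for v in e]

def log_softmax(x):
    # Equivalent to log(softmax(x)) but computed directly:
    # log_softmax(x)_i = x_i - log(sum_j exp(x_j)).
    lse = math.log(sum(math.exp(v) for v in x))
    return [v - lse for v in x]

logits = [2.0, 1.0, 0.1]
print([math.log(p) for p in softmax(logits)])
print(log_softmax(logits))  # same values, one fewer rounding step
```

The one-pass form also has a simpler gradient, which is part of the appeal for training.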

**Q:** wondering if log(sum(exp(x\_i))) is implemented as max(X) since the largest value in X would practically …
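On the question: the standard numerically stable implementation doesn't *replace* log(sum(exp(x_i))) with max(X); it shifts by the max and keeps a correction term, so the result stays exact while the largest exponent becomes exp(0) = 1. A minimal sketch (plain Python, function name is mine):

```python
import math

def logsumexp(x):
    # Stable log(sum(exp(x_i))): factor out max(x) so no exp() overflows.
    # log(sum(exp(x_i))) = m + log(sum(exp(x_i - m))), with m = max(x).
    m = max(x)
    return m + math.log(sum(math.exp(v - m) for v in x))

# max(x) alone is only the limit when one value dominates; the log-sum
# term adds the exact contribution of the rest.
x = [1000.0, 999.0, 998.0]
print(logsumexp(x))  # finite, whereas a naive sum(exp(x)) overflows
```

So max(X) is a good *approximation* when one logit dominates, but the implemented quantity is max plus the (small, bounded) correction.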

deep learning exercise fun learning machinelearning
