[D] On initialization schemes for MLPs: practice and theory
Web: https://www.reddit.com/r/MachineLearning/comments/sdjqab/d_on_initialization_schemes_for_mlps_practice_and/
Jan. 26, 2022, 11:31 p.m. | /u/carlml
In this post I am thinking only about MLPs with the ReLU activation function.
The default PyTorch initialization for linear layers draws weights from a uniform distribution centered at 0 whose limits depend on the input dimension. Many papers instead assume initialization from a Gaussian distribution with mean 0 and some specified variance.
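For reference, a minimal sketch (my own illustration, not from the post) comparing PyTorch's default uniform initialization for `nn.Linear` with a zero-mean Gaussian initialization of the kind papers assume; the variance 2/fan_in is the He/Kaiming choice for ReLU and is just one example:

```python
import math
import torch
import torch.nn as nn

in_dim, out_dim = 512, 512
layer = nn.Linear(in_dim, out_dim)

# PyTorch default: weights ~ U(-bound, bound) with bound = 1/sqrt(in_dim)
# (internally kaiming_uniform_ with a=sqrt(5)), so the limits depend
# only on the input dimension.
bound = 1.0 / math.sqrt(in_dim)
print(layer.weight.min().item(), layer.weight.max().item(), bound)

# Gaussian alternative assumed in many papers: W ~ N(0, sigma^2).
# Here sigma^2 = 2 / fan_in (He initialization for ReLU) as an example.
with torch.no_grad():
    layer.weight.normal_(mean=0.0, std=math.sqrt(2.0 / in_dim))
    layer.bias.zero_()
```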
There is also this work [Pennington (2017)] that proposes orthogonal initialization to achieve what they call dynamical isometry, which means that the singular values of the input-output Jacobian are 1 (or stay …
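A minimal sketch of orthogonal initialization in PyTorch (my own illustration; the helper name and the gain value are arbitrary choices, not taken from the linked paper):

```python
import torch
import torch.nn as nn

def init_orthogonal(model: nn.Module, gain: float = 1.0) -> None:
    """Apply orthogonal initialization to every Linear layer's weights.

    An orthogonal weight matrix has all singular values equal to `gain`,
    which is the per-layer version of the property dynamical isometry
    asks for from the whole input-output Jacobian.
    """
    for m in model.modules():
        if isinstance(m, nn.Linear):
            nn.init.orthogonal_(m.weight, gain=gain)
            if m.bias is not None:
                nn.init.zeros_(m.bias)

# Usage on a small ReLU MLP
mlp = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 10))
init_orthogonal(mlp)
```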