Dropout in Transformer
Dec. 29, 2023, 7 a.m. | /u/ytu876
Deep Learning www.reddit.com
I'm following [Coding a transformer from scratch](https://www.youtube.com/watch?v=ISNdQcPhsts), and have a question about dropout. What are the criteria for including dropout in a component? My understanding is that dropout is there to prevent overfitting.
1. InputEmbedding has no dropout
2. LayerNormalization has no dropout
3. But something as simple as ResidualConnection (i.e. the Add in the "Add + Norm" part) has dropout
Is there any rule to determine whether a component should have dropout or not?
Thanks.
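
To make item 3 concrete, here is a minimal sketch of what "dropout in the residual connection" usually means: dropout is applied to the sub-layer's output, and the untouched input is then added back. This is plain Python, not the video's code; the function names `dropout` and `residual_connection` and their parameters are made up for illustration, and the normalization step is left out to keep the focus on where dropout sits.

```python
import random


def dropout(values, p, training, rng):
    """Inverted dropout: zero each element with probability p and
    scale the survivors by 1 / (1 - p) so the expected value is unchanged."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # At inference time dropout is the identity.
        return list(values)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def residual_connection(x, sublayer, p, training, rng):
    """Compute x + dropout(sublayer(x)).

    Only the sub-layer branch is dropped; the skip path x is never zeroed.
    (Hypothetical helper; normalization is omitted here.)
    """
    branch = dropout(sublayer(x), p, training, rng)
    return [a + b for a, b in zip(x, branch)]


def double(v):
    # Stand-in for an attention or feed-forward sub-layer.
    return [2.0 * t for t in v]


rng = random.Random(0)
x = [1.0, 2.0, 3.0]
eval_out = residual_connection(x, double, p=0.5, training=False, rng=rng)
train_out = residual_connection(x, double, p=0.5, training=True, rng=rng)
```

In evaluation mode the result is simply `x + sublayer(x)`. In training mode each element is either `x` alone (branch dropped) or `x` plus the rescaled branch value, so the regularization acts on what the sub-layer contributes rather than on the Add itself.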