[P] Silly project: implement MLP using a transformer (yo, dawg...)
Dec. 6, 2023, 9:28 p.m. | /u/killerstorm
Machine Learning www.reddit.com
E.g. consider GPT-Neo-350M. Each MLP has 1024-element input and output layers and a 4096-element inner layer. That requires 2 × 1024 × 4096 ≈ 8.4M weights. The whole model has 24 of them (one per layer), so roughly 200M weights go to MLPs.
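As a sanity check on the arithmetic, here is a minimal sketch using the dimensions stated above (standard two-matrix MLP, biases ignored):

```python
# MLP parameter count for a GPT-Neo-350M-style transformer,
# using the dimensions from the post above.
d_model = 1024   # input/output width of each MLP
d_inner = 4096   # inner (hidden) width of each MLP
n_layers = 24    # one MLP per transformer layer

# Each MLP is an up-projection (d_model x d_inner) plus a
# down-projection (d_inner x d_model).
per_mlp = 2 * d_model * d_inner
total = n_layers * per_mlp

print(f"{per_mlp:,}")  # 8,388,608  (~8.4M per MLP)
print(f"{total:,}")    # 201,326,592  (~200M across the model)
```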
And yet these MLPs basically just do 4096 dot products to calculate features. Millions of weights just to calculate …
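The "4096 dot products" framing can be sketched as follows; the weights and the GELU activation here are assumptions (random weights for illustration, not the actual GPT-Neo parameters):

```python
import numpy as np

# Sketch of what one transformer MLP computes (dims from the post;
# weights are random placeholders, not real model parameters).
d_model, d_inner = 1024, 4096
rng = np.random.default_rng(0)
W_in = rng.standard_normal((d_model, d_inner)) * 0.02   # up-projection
W_out = rng.standard_normal((d_inner, d_model)) * 0.02  # down-projection

def mlp(x):
    # x @ W_in is 4096 dot products of x against columns of W_in --
    # each inner unit is one "feature detector".
    h = x @ W_in
    # tanh approximation of GELU, as used in GPT-style models
    h = 0.5 * h * (1 + np.tanh(np.sqrt(2 / np.pi) * (h + 0.044715 * h**3)))
    return h @ W_out  # project features back to the residual stream

x = rng.standard_normal(d_model)
y = mlp(x)
print(y.shape)  # (1024,)
```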