Web: https://www.reddit.com/r/MachineLearning/comments/up31we/r_scaling_tasks_during_pretraining_may_be_scaling/

May 13, 2022, 10:25 p.m. | /u/Competitive-Rub-1958

Machine Learning reddit.com


> ...This leads to a crucial discovery that task scaling can be an efficient alternative to model scaling; i.e., the model size has little impact on performance with an extremely large number of tasks. Our results show that task scaling can substantially improve training efficiency by 30 times in FLOPs...

*tl;dr* scale the number of tasks, as well as data, compute, hyperparameters, FLOPs and *prayers*, to train LLMs effectively.

