Feb. 12, 2024, 9:35 a.m. | /u/Primary-Wasabi292


I am wondering whether it is worth going through extensive hyperparameter tuning of model architecture. Learning-rate tuning often pays off, since it has a big impact on convergence and overall performance, but when tuning architecture hyperparameters (num_layers, num_heads, dropout, etc.), I have found that as long as you stay within a certain sweet-spot range, the actual performance differences are marginal. Am I doing something wrong? What are your experiences with this?
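
For concreteness, here is a minimal random-search sketch of the kind of setup I mean (plain Python; `train_and_evaluate` is a hypothetical placeholder for an actual training/validation loop): the learning rate is swept on a log scale, while the architecture knobs are constrained to a narrow sweet-spot range.

```python
import math
import random

def train_and_evaluate(lr, num_layers, num_heads, dropout):
    """Hypothetical placeholder: train a model with these hyperparameters and
    return a validation loss (lower is better). Swap in a real training loop;
    the random return value only keeps this sketch runnable end to end."""
    return random.random()

def sample_config():
    return {
        # Learning rate on a log-uniform scale; usually the knob that matters most.
        "lr": 10 ** random.uniform(-5, -2),
        # Architecture knobs kept inside a narrow, known-reasonable range.
        "num_layers": random.choice([4, 6, 8]),
        "num_heads": random.choice([4, 8]),
        "dropout": random.uniform(0.0, 0.2),
    }

def random_search(n_trials=20):
    best_cfg, best_loss = None, math.inf
    for _ in range(n_trials):
        cfg = sample_config()
        loss = train_and_evaluate(**cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss

if __name__ == "__main__":
    cfg, loss = random_search(n_trials=20)
    print(f"best config: {cfg}, val loss: {loss:.4f}")
```

With a budget like this, my question is basically whether the trials spent on num_layers/num_heads/dropout are better spent on more learning-rate (and schedule) samples.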
