Feb. 12, 2024, 9:35 a.m. | /u/Primary-Wasabi292

Machine Learning www.reddit.com

I am wondering whether it is worth going through extensive hyperparameter tuning of model architecture. Learning rate tuning often pays off, since it has a big impact on convergence and all-around performance, but when tuning architecture (num_layers, num_heads, dropout, etc.), I have found that as long as you stay within a certain sweet-spot range, the actual performance differences are marginal. Am I doing something wrong? What are your experiences with this?
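A minimal sketch of the comparison being described, not anyone's actual setup: a random search that sweeps the learning rate with the architecture fixed, versus one that sweeps architecture with the learning rate fixed. `train_and_evaluate` is a hypothetical placeholder that would normally run a real training job; here it returns a synthetic score (deliberately much more sensitive to learning rate than to architecture) so the script runs end to end.

```python
import math
import random

random.seed(0)


def train_and_evaluate(lr, num_layers, num_heads, dropout):
    # Placeholder for a real training run returning a validation loss.
    # The synthetic score mimics the effect described in the post:
    # strong sensitivity to learning rate, weak sensitivity to
    # architecture within a reasonable range.
    lr_term = (math.log10(lr) + 3.0) ** 2            # best around lr ~ 1e-3
    arch_term = 0.01 * abs(num_layers - 6) + 0.01 * dropout
    return lr_term + arch_term + random.gauss(0, 0.05)


def sample_config(sweep):
    if sweep == "lr":                                 # vary lr, fix architecture
        return dict(lr=10 ** random.uniform(-5, -1),
                    num_layers=6, num_heads=8, dropout=0.1)
    else:                                             # fix lr, vary architecture
        return dict(lr=1e-3,
                    num_layers=random.choice([4, 6, 8, 12]),
                    num_heads=random.choice([4, 8, 16]),
                    dropout=random.uniform(0.0, 0.3))


for sweep in ("lr", "arch"):
    scores = [train_and_evaluate(**sample_config(sweep)) for _ in range(30)]
    print(f"{sweep:>4} sweep: best={min(scores):.3f}  "
          f"spread={max(scores) - min(scores):.3f}")
```

Under these assumptions, the learning-rate sweep shows a much larger spread between its best and worst trials than the architecture sweep, which is the pattern the question is asking about.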

