Sept. 13, 2023, 3:07 a.m. | /u/ginger_turmeric

Machine Learning

Currently I'm training medium-sized (1B-3B) audio models. I have several different architectures in mind. Obviously I don't want to train the full-sized models and then compare them; that's a waste of money. So I'm thinking of training smaller versions (~100M) and then comparing those instead.

My question is: is there some sort of best practice for this? Some smaller fraction of your full model size at which it is best to compare? Thanks.
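One common approach to this kind of small-scale comparison is to train a ladder of small models per architecture and fit a power-law scaling curve (loss as a function of parameter count), then extrapolate each curve to the target size rather than comparing at a single small size. A minimal sketch, assuming synthetic held-out losses for one hypothetical architecture variant (the numbers below are made up for illustration):

```python
import numpy as np

# Hypothetical held-out losses from a ladder of small runs (synthetic data).
params = np.array([25e6, 50e6, 100e6, 200e6])  # model sizes N
losses = np.array([3.10, 2.85, 2.62, 2.41])    # eval loss at each size

# Fit L(N) ≈ a * N^b via linear regression in log-log space
# (b should come out negative: loss falls as the model grows).
b, log_a = np.polyfit(np.log(params), np.log(losses), 1)
a = np.exp(log_a)

def predict_loss(n_params: float) -> float:
    """Extrapolate the fitted power law to a larger model size."""
    return a * n_params ** b

# Compare architectures by their predicted loss at the full 1B target,
# not just by their loss at the largest small run.
print(predict_loss(1e9))
```

Repeating this per architecture lets you see whether the curves cross: an architecture that wins at 100M can lose at 1B if its curve is flatter, which is exactly what a single-size comparison would miss.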

