Aug. 5, 2022, 11:42 a.m. | /u/user_--

Machine Learning www.reddit.com

These papers indicate that when a large model is trained on a small dataset for a very long time, the test loss first goes down, then climbs back up as the model overfits, but eventually comes back down even lower than before, and the model ends up generalizing correctly (sometimes called "grokking").
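For concreteness, here is a minimal sketch of the kind of experiment those papers describe, loosely following the modular-arithmetic setup: train a small network on a fraction of all (a + b) mod p pairs for far longer than it takes to fit the training set, and watch whether test accuracy eventually recovers. The architecture and hyperparameters below are illustrative guesses, not the papers' exact settings:

    # Minimal grokking-style experiment (illustrative, not the papers' exact setup).
    import torch
    import torch.nn as nn

    P = 97  # modulus; the task is predicting (a + b) mod P
    # Enumerate every (a, b) pair and keep only a small fraction for training.
    pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
    labels = (pairs[:, 0] + pairs[:, 1]) % P
    perm = torch.randperm(len(pairs))
    n_train = int(0.3 * len(pairs))  # small dataset: 30% of all pairs
    train_idx, test_idx = perm[:n_train], perm[n_train:]

    class TinyNet(nn.Module):
        def __init__(self, p, d=128):
            super().__init__()
            self.embed = nn.Embedding(p, d)
            self.mlp = nn.Sequential(
                nn.Linear(2 * d, 256), nn.ReLU(), nn.Linear(256, p)
            )

        def forward(self, x):
            e = self.embed(x)              # (batch, 2, d)
            return self.mlp(e.flatten(1))  # (batch, p) logits

    model = TinyNet(P)
    # Heavy weight decay; regularization is reported to matter for grokking.
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
    loss_fn = nn.CrossEntropyLoss()

    def accuracy(idx):
        with torch.no_grad():
            return (model(pairs[idx]).argmax(-1) == labels[idx]).float().mean().item()

    # Train far past the point where the training set is fully fit, and log
    # whether test accuracy eventually jumps long after train accuracy saturates.
    for step in range(50_000):
        opt.zero_grad()
        loss = loss_fn(model(pairs[train_idx]), labels[train_idx])
        loss.backward()
        opt.step()
        if step % 1000 == 0:
            print(f"step {step:6d}  train acc {accuracy(train_idx):.3f}  "
                  f"test acc {accuracy(test_idx):.3f}")

In runs like this, train accuracy typically saturates early while test accuracy stays low for a long stretch before improving, which is the delayed-generalization pattern the question is about.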

Do people take advantage of this in practice to get well-generalizing models on small datasets? Do people now train for longer in order to get better models? Or has this not caught on in practice for some reason? …
