Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging. (arXiv:2209.14981v2 [cs.LG] UPDATED)
Oct. 7, 2022, 1:14 a.m. | Jean Kaddour
stat.ML updates on arXiv.org arxiv.org
Training vision or language models on large datasets can take days, if not weeks. We show that averaging the weights of the k latest checkpoints, each collected at the end of an epoch, can speed up the training progression in terms of loss and accuracy by dozens of epochs, corresponding to time savings of up to ~68 and ~30 GPU hours when training a ResNet50 on ImageNet and a RoBERTa-Base model on WikiText-103, respectively. We also provide the code and model checkpoint …
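The core idea, averaging the k most recent end-of-epoch checkpoints, can be sketched in a few lines. This is a minimal illustration, not the authors' released code: checkpoints are shown as plain dicts of float lists, the class name `LatestWeightAverager` is made up for this example, and a real implementation would operate on framework tensors.

```python
from collections import deque

class LatestWeightAverager:
    """Keeps the k most recent checkpoints and returns their uniform average."""

    def __init__(self, k):
        self.checkpoints = deque(maxlen=k)  # oldest checkpoint is evicted

    def update(self, state_dict):
        # Called at the end of each epoch with a copy of the model weights.
        self.checkpoints.append(state_dict)

    def average(self):
        # Uniform average over the stored checkpoints, parameter by parameter.
        n = len(self.checkpoints)
        avg = {}
        for name, first in self.checkpoints[0].items():
            avg[name] = [
                sum(ckpt[name][i] for ckpt in self.checkpoints) / n
                for i in range(len(first))
            ]
        return avg

# Toy usage: three "epochs" of a single two-parameter layer, keeping k=2.
lawa = LatestWeightAverager(k=2)
lawa.update({"w": [1.0, 2.0]})
lawa.update({"w": [3.0, 4.0]})
lawa.update({"w": [5.0, 6.0]})  # the first checkpoint drops out
print(lawa.average())  # {'w': [4.0, 5.0]}
```

The averaged weights are used for evaluation; training continues from the latest (un-averaged) checkpoint, which is what distinguishes this from schemes that average over the whole trajectory.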