Iterate Averaging in the Quest for Best Test Error

Jan. 1, 2024 | Diego Granziol, Nicholas P. Baskerville, Xingchen Wan, Samuel Albanie, Stephen Roberts

JMLR www.jmlr.org

We analyse and explain the increased generalisation performance of iterate averaging using a Gaussian process perturbation model between the true and batch risk surfaces on the high-dimensional quadratic. We derive three phenomena from our theoretical results: (1) the importance of combining iterate averaging (IA) with large learning rates and regularisation for improved generalisation; (2) a justification for less frequent averaging; (3) the expectation that adaptive gradient methods work equally well, or better, with iterate averaging than their non-adaptive counterparts. …
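
To make the idea of iterate averaging concrete, here is a minimal sketch on a toy noisy quadratic. It is not the authors' model or code: the isotropic Gaussian gradient noise is a simplifying stand-in for the Gaussian process perturbation described above, and the function names, Hessian spectrum, learning rate, and step counts are all assumptions chosen for illustration. The `avg_every` parameter is a hypothetical knob for averaging only every k-th iterate.

```python
import numpy as np


def true_risk(w, hessian_eigs):
    """Noise-free quadratic risk 0.5 * sum_i h_i * w_i^2 (minimum at w = 0)."""
    return 0.5 * float(np.sum(hessian_eigs * w ** 2))


def sgd_with_iterate_averaging(hessian_eigs, lr=0.5, noise_std=1.0,
                               n_steps=2000, avg_start=1000, avg_every=1,
                               seed=0):
    """Run noisy gradient descent on a diagonal quadratic.

    Returns the final iterate and the running mean of the iterates
    collected every `avg_every` steps from step `avg_start` onwards.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    w = np.ones_like(hessian_eigs)
    w_avg = np.zeros_like(w)
    n_avg = 0
    for step in range(n_steps):
        # "Batch" gradient = true gradient + isotropic Gaussian noise
        # (assumed noise model, simpler than the one in the abstract).
        grad = hessian_eigs * w + noise_std * rng.standard_normal(w.shape)
        w = w - lr * grad
        if step >= avg_start and (step - avg_start) % avg_every == 0:
            n_avg += 1
            w_avg += (w - w_avg) / n_avg  # incremental running mean
    return w, w_avg


eigs = np.linspace(0.1, 1.0, 100)  # assumed Hessian spectrum
w_last, w_avg = sgd_with_iterate_averaging(eigs)
_, w_avg_sparse = sgd_with_iterate_averaging(eigs, avg_every=10)

print("final iterate risk:      ", true_risk(w_last, eigs))
print("averaged (every step):   ", true_risk(w_avg, eigs))
print("averaged (every 10th):   ", true_risk(w_avg_sparse, eigs))
```

In this toy setting the large learning rate keeps the final iterate bouncing around the minimum, while the averaged iterate cancels much of that noise; averaging every tenth iterate uses far fewer samples over the same window. This only illustrates the mechanics and makes no claim about the paper's quantitative results.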


