March 8, 2024, 5:43 a.m. | David Newton, Raghu Bollapragada, Raghu Pasupathy, Nung Kwan Yip

stat.ML updates on arXiv.org

arXiv:2103.04392v3 Announce Type: replace-cross
Abstract: Stochastic Gradient (SG) is the defacto iterative technique to solve stochastic optimization (SO) problems with a smooth (non-convex) objective $f$ and a stochastic first-order oracle. SG's attractiveness is due in part to its simplicity of executing a single step along the negative subsampled gradient direction to update the incumbent iterate. In this paper, we question SG's choice of executing a single step as opposed to multiple steps between subsample updates. Our investigation leads naturally to …
