Web: https://www.reddit.com/r/MachineLearning/comments/ulk71e/r_happy_to_share_my_paper_and_python_code_on/

May 9, 2022, 5:35 a.m. | /u/alexsht1


Paper: [https://arxiv.org/abs/2205.01457](https://arxiv.org/abs/2205.01457)
Code: [https://github.com/alexshtf/inc_prox_pt](https://github.com/alexshtf/inc_prox_pt)

Models are often trained using variants of the gradient update rule:

xₜ₊₁ = xₜ - β∇ƒ(xₜ)

where x is the vector of model parameters and ƒ is the cost function of the current training sample (or mini-batch). This rule has another well-known interpretation, the proximal view:

xₜ₊₁ = argmin { ƒ(xₜ) + ⟨∇ƒ(xₜ), x - xₜ⟩ + 1/(2β) ‖x - xₜ‖² },

meaning "balance between minimizing a linear approx. of ƒ at xₜ and being close to xₜ". The step-size β …
