March 26, 2024, 4:41 a.m. | Tongle Wu, Ying Sun

cs.LG updates on arXiv.org arxiv.org

arXiv:2403.15654v1 Announce Type: new
Abstract: We revisit two fundamental decentralized optimization methods, Decentralized Gradient Tracking (DGT) and Decentralized Gradient Descent (DGD), with multiple local updates. We consider two settings and demonstrate that incorporating $K > 1$ local update steps can reduce communication complexity. Specifically, for $\mu$-strongly convex and $L$-smooth loss functions, we proved that local DGT achieves communication complexity $\tilde{\mathcal{O}} \Big(\frac{L}{\mu K} + \frac{\delta}{\mu (1 - \rho)} + \frac{\rho }{(1 - \rho)^2} \cdot \frac{L+ \delta}{\mu}\Big)$, where $\rho$ measures the network …

abstract arxiv communication complexity cs.lg data decentralized gradient math.oc multiple optimization reduce tracking type update updates

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Global Data Architect, AVP - State Street Global Advisors

@ State Street | Boston, Massachusetts

Data Engineer

@ NTT DATA | Pune, MH, IN