Topology-aware Generalization of Decentralized SGD. (arXiv:2206.12680v2 [cs.LG] UPDATED)
June 29, 2022, 1:11 a.m. | Tongtian Zhu, Fengxiang He, Lan Zhang, Zhengyang Niu, Mingli Song, Dacheng Tao
cs.LG updates on arXiv.org arxiv.org
This paper studies the algorithmic stability and generalizability of decentralized stochastic gradient descent (D-SGD). We prove that the consensus model learned by D-SGD is $\mathcal{O}(m/N + 1/m + \lambda^2)$-stable in expectation in the non-convex non-smooth setting, where $N$ is the total sample size of the whole system, $m$ is the number of workers, and $1-\lambda$ is the spectral gap that measures the connectivity of the communication topology. These results then deliver an $\mathcal{O}\big(1/N + \big((m^{-1}\lambda^2)^{\frac{\alpha}{2}} + m^{-\alpha}\big)/N^{1-\frac{\alpha}{2}}\big)$ in-average generalization bound, which is non-vacuous even when $\lambda$ is close …
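The spectral gap $1-\lambda$ in the abstract refers to a standard quantity: $\lambda$ is the second-largest eigenvalue magnitude of the (doubly stochastic) gossip/mixing matrix of the communication topology. A minimal sketch of how one might compute it, assuming a simple ring topology with uniform neighbor-averaging weights (the function names and the 1/3 weights are illustrative choices, not from the paper):

```python
import numpy as np

def ring_gossip_matrix(m):
    """Doubly stochastic mixing matrix for a ring of m workers:
    each worker averages itself with its two neighbors (weight 1/3 each)."""
    W = np.zeros((m, m))
    for i in range(m):
        W[i, i] = 1 / 3
        W[i, (i - 1) % m] = 1 / 3
        W[i, (i + 1) % m] = 1 / 3
    return W

def spectral_gap(W):
    """Return 1 - lambda, where lambda is the second-largest eigenvalue
    magnitude of W. A larger gap means a better-connected topology."""
    mags = np.sort(np.abs(np.linalg.eigvals(W)))[::-1]
    return 1.0 - mags[1]

# A fully connected topology (W = all-ones / m) has lambda = 0 and gap 1;
# a ring's gap shrinks toward 0 as m grows, illustrating why the bounds
# above degrade on sparsely connected topologies.
for m in (4, 16, 64):
    print(m, spectral_gap(ring_gossip_matrix(m)))
```

For the 4-worker ring above the eigenvalues are $1/3 + (2/3)\cos(2\pi k/4)$, giving $\lambda = 1/3$ and a gap of $2/3$; for larger rings the gap decays, consistent with the $\lambda^2$ terms in the stability and generalization bounds.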
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Data Analyst (M/F)
@ Business & Decision | Montpellier, France
Machine Learning Researcher
@ VERSES | Brighton, England, United Kingdom - Remote