Asynchronous Training Schemes in Distributed Learning with Time Delay. (arXiv:2208.13154v1 [cs.LG])
Aug. 30, 2022, 1:11 a.m. | Haoxiang Wang, Zhanhong Jiang, Chao Liu, Soumik Sarkar, Dongxiang Jiang, Young M. Lee
cs.LG updates on arXiv.org
In the context of distributed deep learning, stale weights or gradients can
result in poor algorithmic performance. This issue is usually tackled by
delay-tolerant algorithms under some mild assumptions on the objective
functions and step sizes. In this paper, we propose a different approach to
develop a new algorithm, called Predicting Clipping Asynchronous Stochastic
Gradient Descent (PC-ASGD). Specifically, PC-ASGD has two steps: the
predicting step leverages gradient prediction using a Taylor expansion to
reduce …
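The abstract is cut off here, but the predicting step it describes, using a Taylor expansion of a stale gradient to approximate the gradient at the current weights, is a known delay-compensation idea. Below is a minimal NumPy sketch of one such prediction, assuming a first-order expansion with a diagonal outer-product surrogate for the Hessian (common in delay-compensated ASGD variants); the function name, the lam coefficient, and the Hessian surrogate are illustrative assumptions, not necessarily the paper's exact estimator.

import numpy as np

def predict_gradient(stale_grad, stale_w, current_w, lam=0.04):
    # First-order Taylor expansion of the gradient around the stale weights:
    #   g(w_t) ~= g(w_tau) + H(w_tau) (w_t - w_tau)
    # Hypothetical diagonal surrogate for the Hessian: the elementwise
    # outer product g(w_tau) * g(w_tau), scaled by lam. The paper's actual
    # predictor may differ.
    return stale_grad + lam * stale_grad * stale_grad * (current_w - stale_w)

# Usage: a worker computed stale_grad at stale_w; the server has since
# moved to current_w, so the stale gradient is corrected before applying it.
stale_w = np.array([0.5, -1.2])
current_w = np.array([0.45, -1.1])
stale_grad = np.array([0.2, -0.4])
predicted = predict_gradient(stale_grad, stale_w, current_w)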