Near-Optimal Sparse Allreduce for Distributed Deep Learning. (arXiv:2201.07598v1 [cs.DC])
Jan. 20, 2022, 2:10 a.m. | Shigang Li, Torsten Hoefler
cs.LG updates on arXiv.org
Communication overhead is one of the major obstacles to training large deep
learning models at scale. Gradient sparsification is a promising technique to
reduce the communication volume. However, it is very challenging to obtain real
performance improvements because of (1) the difficulty of achieving a scalable
and efficient sparse allreduce algorithm and (2) the sparsification overhead.
This paper proposes O$k$-Top$k$, a scheme for distributed training with sparse
gradients. O$k$-Top$k$ integrates a novel sparse allreduce algorithm (less than
6$k$ communication volume …
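For context on what gradient sparsification means here: each worker keeps only its $k$ largest-magnitude gradient entries and exchanges (index, value) pairs instead of the dense gradient, which is where the communication saving, and the "less than 6$k$ communication volume" bound in the abstract, come from. The sketch below is a rough illustration of plain per-worker Top-$k$ sparsification only, not the paper's O$k$-Top$k$ allreduce; the use of PyTorch and the function names are assumptions for illustration.

```python
import math
import torch

def topk_sparsify(grad: torch.Tensor, k: int):
    """Keep only the k largest-magnitude entries of a gradient tensor.

    Returns flat indices and the corresponding values; all other entries
    are treated as zero (error-feedback schemes accumulate them locally).
    """
    flat = grad.flatten()
    # Indices of the k entries with the largest absolute value.
    _, idx = torch.topk(flat.abs(), k)
    return idx, flat[idx]

def densify(idx: torch.Tensor, vals: torch.Tensor, shape) -> torch.Tensor:
    """Rebuild a dense tensor from (index, value) pairs; dropped entries stay zero."""
    dense = torch.zeros(math.prod(shape), dtype=vals.dtype)
    dense[idx] = vals
    return dense.reshape(shape)

if __name__ == "__main__":
    g = torch.randn(4, 8)              # stand-in for one worker's local gradient
    idx, vals = topk_sparsify(g, k=4)
    g_sparse = densify(idx, vals, g.shape)
    # In a distributed run, each worker would exchange its (idx, vals) pairs
    # rather than the full dense gradient; designing that exchange so it stays
    # scalable is the sparse-allreduce problem the paper addresses.
```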