DBS: Dynamic Batch Size For Distributed Deep Neural Network Training. (arXiv:2007.11831v2 [cs.LG] UPDATED)
cs.LG updates on arXiv.org
Synchronous strategies with data parallelism, such as Synchronous Stochastic
Gradient Descent (S-SGD) and model averaging methods, are widely utilized in
distributed training of Deep Neural Networks (DNNs), largely owing to their
easy implementation and promising performance. In particular, each worker in
the cluster hosts a copy of the DNN and an evenly divided share of the dataset
with a fixed mini-batch size, so that DNN training remains convergent. Under
these strategies, workers with different computational capabilities need to
wait for each other because of …
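
The abstract is truncated before the method is described, but the title points at the core idea: re-balancing per-worker mini-batch sizes so that heterogeneous workers finish their local steps at roughly the same time, shrinking the synchronization barrier. A minimal sketch of one plausible throughput-proportional re-balancing rule follows; all names and the proportional rule itself are illustrative assumptions, not the paper's exact DBS algorithm.

```python
# Illustrative sketch only: throughput-proportional batch-size re-balancing
# for synchronous data-parallel training. This is an assumed rule for
# illustration, not the paper's exact DBS algorithm (the abstract is
# truncated before the method is described).

def rebalance_batch_sizes(global_batch_size, epoch_times, current_sizes):
    """Assign each worker a mini-batch size proportional to its measured
    throughput (samples processed per second over the last epoch), so that
    faster workers receive more samples and stragglers fewer. The global
    batch size is preserved, which keeps the effective SGD step unchanged."""
    throughputs = [n / t for n, t in zip(current_sizes, epoch_times)]
    total = sum(throughputs)
    sizes = [max(1, round(global_batch_size * s / total)) for s in throughputs]
    # Absorb rounding drift so the sizes still sum to the global batch size.
    sizes[-1] += global_batch_size - sum(sizes)
    return sizes

# Example: 4 workers, worker 3 is ~2x slower than the others.
sizes = rebalance_batch_sizes(
    global_batch_size=256,
    epoch_times=[10.0, 10.0, 20.0, 10.0],  # seconds for the last epoch
    current_sizes=[64, 64, 64, 64],
)
print(sizes)  # [73, 73, 37, 73] -- the slower worker gets a smaller share
```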