Distributed signSGD with improved accuracy and network-fault tolerance

T Le Phong, TT Phuong - IEEE Access, 2020 - ieeexplore.ieee.org
This paper proposes DropSignSGD, a communication-efficient and network-fault tolerant algorithm for training deep neural networks in a distributed and synchronous fashion. In DropSignSGD, all numerical elements communicated between machines are either 1 or -1, and can therefore be represented by a single bit. More importantly, DropSignSGD does not degrade benchmark accuracy on the ImageNet dataset compared with the traditional distributed stochastic gradient descent algorithm, owing to a simple trick of memorizing unused gradients. The experimental results are supported by a mathematical proof showing that DropSignSGD converges under standard assumptions.
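Based only on the abstract's description, the following is a minimal sketch of sign-based (one-bit) gradient compression with a local residual "memory" of the gradient information that the sign does not convey. The exact DropSignSGD update rule, drop mechanism, and fault-tolerance handling are not given in the abstract, so the names (Worker, lr, memory, majority_vote) and the error-feedback-style formulation here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

class Worker:
    """Illustrative worker that sends only the signs of its (corrected) gradients."""

    def __init__(self, dim, lr=0.01):
        self.lr = lr
        self.memory = np.zeros(dim)   # residual gradient mass not yet transmitted (assumed mechanism)

    def compress(self, grad):
        # Fold the memorized residual into the current gradient step,
        # then transmit only signs: every element is +1 or -1, i.e. one bit.
        corrected = self.lr * grad + self.memory
        signs = np.where(corrected >= 0, 1.0, -1.0)
        # Memorize the part the one-bit message failed to convey.
        self.memory = corrected - self.lr * signs
        return signs.astype(np.int8)


def majority_vote(worker_signs):
    # Aggregate the workers' one-bit messages by majority vote; the result
    # is again +/-1 per coordinate and fits in a single bit when broadcast.
    total = np.sum(worker_signs, axis=0)
    return np.where(total >= 0, 1.0, -1.0)


# Toy usage with two workers and a 5-dimensional parameter vector.
rng = np.random.default_rng(0)
params = np.zeros(5)
workers = [Worker(dim=5) for _ in range(2)]
grads = [rng.standard_normal(5) for _ in workers]              # stand-in local gradients
votes = majority_vote([w.compress(g) for w, g in zip(workers, grads)])
params -= 0.01 * votes                                         # hypothetical server-side step size
```

The residual-memory step mirrors the abstract's remark about "memorizing unused gradients": whatever a one-bit message cannot express is carried over to the next round rather than discarded, which is the usual reason such schemes can match full-precision SGD accuracy.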