Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback

Cited by: 0
Authors
Zheng, Shuai [1 ,2 ]
Huang, Ziyue [1 ]
Kwok, James T. [1 ]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[2] Amazon Web Serv, Seattle, WA 98109 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019) | 2019 / Vol. 32
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
Communication overhead is a major bottleneck hampering the scalability of distributed machine learning systems. Recently, there has been a surge of interest in using gradient compression to improve the communication efficiency of distributed neural network training. Using 1-bit quantization, signSGD with majority vote achieves a 32x reduction in communication cost. However, its convergence rests on unrealistic assumptions, and it can diverge in practice. In this paper, we propose a general distributed compressed SGD with Nesterov's momentum. We consider two-way compression, which compresses the gradients both to and from workers. A convergence analysis on nonconvex problems for general gradient compressors is provided. By partitioning the gradient into blocks, a blockwise compressor is introduced such that each gradient block is compressed and transmitted in 1-bit format with a scaling factor, leading to a nearly 32x reduction in communication. Experimental results show that the proposed method converges as fast as full-precision distributed momentum SGD and achieves the same testing accuracy. In particular, on distributed ResNet training with 7 workers on ImageNet, the proposed algorithm achieves the same testing accuracy as momentum SGD using full-precision gradients, but with 46% less wall-clock time.
Pages: 11
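The abstract combines several moving parts: Nesterov momentum at each worker, error feedback on both sides of the communication, and a blockwise 1-bit compressor with a per-block scaling factor. Below is a minimal single-machine sketch in Python/NumPy of how these pieces can fit together. It is not the paper's Algorithm 1: the choice of the mean absolute value as the per-block scaling factor, the exact placement of the learning rate, the momentum form, and the names `Worker`, `Server`, and `blockwise_onebit` are our illustrative assumptions, following common error-feedback formulations.

```python
import numpy as np

def blockwise_onebit(x, blocks):
    """Compress each block of x to sign bits plus one scaling factor.

    blocks is a list of slices partitioning x (e.g. one block per layer).
    A block g of size d is transmitted as (||g||_1 / d) * sign(g):
    d sign bits plus one float, close to a 32x saving for large d.
    """
    out = np.empty_like(x)
    for b in blocks:
        g = x[b]
        scale = np.abs(g).mean()       # one float per block (an assumption)
        out[b] = scale * np.sign(g)    # 1 bit per coordinate
    return out

class Worker:
    """One worker of the compressed momentum SGD scheme (illustrative)."""
    def __init__(self, dim, blocks, lr=0.01, momentum=0.9):
        self.blocks, self.lr, self.mu = blocks, lr, momentum
        self.m = np.zeros(dim)         # momentum buffer
        self.e = np.zeros(dim)         # local error-feedback residual

    def local_step(self, grad):
        # Nesterov-style momentum on the raw stochastic gradient.
        self.m = self.mu * self.m + grad
        update = self.lr * (grad + self.mu * self.m)
        # Error feedback: re-add what previous compressions dropped.
        corrected = update + self.e
        msg = blockwise_onebit(corrected, self.blocks)
        self.e = corrected - msg       # residual carried to the next step
        return msg                     # compressed message sent to the server

class Server:
    """Aggregates worker messages and compresses the broadcast (two-way)."""
    def __init__(self, dim, blocks):
        self.blocks = blocks
        self.e = np.zeros(dim)         # server-side error-feedback residual

    def aggregate(self, msgs):
        avg = np.mean(msgs, axis=0) + self.e
        out = blockwise_onebit(avg, self.blocks)
        self.e = avg - out
        return out                     # compressed update broadcast to workers

# One illustrative round with 2 workers on a 6-dim parameter vector.
dim, blocks = 6, [slice(0, 3), slice(3, 6)]
x = np.zeros(dim)
workers = [Worker(dim, blocks) for _ in range(2)]
server = Server(dim, blocks)
grads = [np.random.randn(dim) for _ in range(2)]   # stand-in gradients
x -= server.aggregate([w.local_step(g) for w, g in zip(workers, grads)])
```

Because both the worker-to-server and server-to-worker messages go through `blockwise_onebit`, each block costs its sign bits plus one float in each direction, which is where the nearly 32x communication saving claimed in the abstract comes from; the residuals `e` ensure the compression error is re-injected rather than lost.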
Related Papers (50 total)
  • [21] Efficient implementation of error-feedback LSL algorithm
    Miranda, MD
    Gerken, M
    da Silva, MTM
    ELECTRONICS LETTERS, 1999, 35 (16) : 1308 - 1309
  • [22] Communication-Efficient Vertical Federated Learning via Compressed Error Feedback
    Valdeira, Pedro
    Xavier, Joao
    Soares, Claudia
    Chi, Yuejie
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2025, 73 : 1065 - 1080
  • [23] Communication-efficient Vertical Federated Learning via Compressed Error Feedback
    Valdeira, Pedro
    Xavier, Joao
    Soares, Claudia
    Chi, Yuejie
    32ND EUROPEAN SIGNAL PROCESSING CONFERENCE, EUSIPCO 2024, 2024, : 1037 - 1041
  • [24] Communication-Efficient Nonconvex Federated Learning With Error Feedback for Uplink and Downlink
    Zhou, Xingcai
    Chang, Le
    Cao, Jinde
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (01) : 1003 - 1014
  • [25] Communication-Efficient Decentralized Local SGD over Undirected Networks
    Qin, Tiancheng
    Etesami, S. Rasoul
    Uribe, Cesar A.
    2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, : 3361 - 3366
  • [26] Communication-Efficient Quantized SGD for Learning Polynomial Neural Network
    Yang, Zhanpeng
    Zhou, Yong
    Wu, Youlong
    Shi, Yuanming
    2021 IEEE INTERNATIONAL PERFORMANCE, COMPUTING, AND COMMUNICATIONS CONFERENCE (IPCCC), 2021
  • [27] QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding
    Alistarh, Dan
    Grubic, Demjan
    Li, Jerry Z.
    Tomioka, Ryota
    Vojnovic, Milan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [28] Communication-efficient Federated Learning via Quantized Clipped SGD
    Jia, Ninghui
    Qu, Zhihao
    Ye, Baoliu
    WIRELESS ALGORITHMS, SYSTEMS, AND APPLICATIONS, WASA 2021, PT I, 2021, 12937 : 559 - 571
  • [29] COMMUNICATION-EFFICIENT DISTRIBUTED MAX-VAR GENERALIZED CCA VIA ERROR FEEDBACK-ASSISTED QUANTIZATION
    Shrestha, Sagar
    Fu, Xiao
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 9052 - 9056
  • [30] Adaptive Top-K in SGD for Communication-Efficient Distributed Learning in Multi-Robot Collaboration
    Ruan, Mengzhe
    Yan, Guangfeng
    Xiao, Yuanzhang
    Song, Linqi
    Xu, Weitao
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2024, 18 (03) : 487 - 501