vqSGD: Vector Quantized Stochastic Gradient Descent

Cited: 5
Authors
Gandikota, Venkata [1]
Kane, Daniel [2,3]
Maity, Raj Kumar [4]
Mazumdar, Arya [5]
Affiliations
[1] Syracuse Univ, Dept Elect Engn & Comp Sci, Syracuse, NY 13244 USA
[2] Univ Calif San Diego, Dept Comp Sci, La Jolla, CA 92093 USA
[3] Univ Calif San Diego, Dept Math, La Jolla, CA 92093 USA
[4] UMass Amherst, Coll Informat & Comp Sci, Amherst, MA 01003 USA
[5] Univ Calif San Diego, Halicioglu Data Sci Inst, La Jolla, CA 92093 USA
Keywords
Vector quantization; communication efficiency; mean estimation; stochastic gradient descent (SGD)
DOI
10.1109/TIT.2022.3161620
CLC Classification
TP [Automation & Computer Technology]
Subject Classification
0812
Abstract
In this work, we present a family of vector quantization schemes, vqSGD (Vector-Quantized Stochastic Gradient Descent), that provide an asymptotic reduction in the communication cost with convergence guarantees in first-order distributed optimization. In the process, we derive the following fundamental information-theoretic fact: $\Theta(d/R^2)$ bits are necessary and sufficient (up to an additive $O(\log d)$ term) to describe an unbiased estimator $\hat{g}$ of any $g$ in the $d$-dimensional unit sphere, under the constraint that $\|\hat{g}\|_2 \le R$ almost surely, for $R > 1$. In particular, we consider a randomized scheme based on the convex hull of a point set that returns an unbiased estimator of a $d$-dimensional gradient vector with almost surely bounded norm. We provide multiple efficient instances of our scheme that are near optimal and require only $o(d)$ bits of communication, at the expense of a tolerable increase in error. The instances of our quantization scheme are obtained from well-known families of binary error-correcting codes and provide a smooth tradeoff between the communication cost and the estimation error of quantization. Furthermore, we show that vqSGD also offers automatic privacy guarantees.
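To make the convex-hull construction concrete, here is a minimal sketch in Python/NumPy of one natural instance consistent with the abstract's description: the scaled cross-polytope point set $\{\pm\sqrt{d}\,e_i\}$, whose convex hull contains the unit $\ell_2$ ball. The helper names (`quantize_cross_polytope`, `dequantize_cross_polytope`) are hypothetical, not from the paper; the sketch only illustrates the mechanism of sampling a vertex with convex-combination weights so that the sampled vertex is an unbiased, almost surely norm-bounded estimator of $g$.

```python
import numpy as np

def quantize_cross_polytope(g, rng):
    """Encode g (with ||g||_2 <= 1) as the index of one vertex of the
    scaled cross-polytope {+-sqrt(d) e_i}, sampled so that the vertex
    is an unbiased estimator of g. (Hypothetical helper, not the
    paper's API.)"""
    d = g.shape[0]
    c = np.sqrt(d)              # ||g||_1 <= sqrt(d) * ||g||_2 <= c,
                                # so g lies in conv{+-c e_i}
    l1 = np.abs(g).sum()
    # Put weight |g_i|/c on the vertex sign(g_i) * c * e_i; spread the
    # leftover mass (1 - l1/c) evenly over all 2d vertices, which adds
    # nothing to the mean by symmetry.
    slack = (1.0 - l1 / c) / (2 * d)
    p_plus = np.maximum(g, 0.0) / c + slack     # P(pick +c e_i)
    p_minus = np.maximum(-g, 0.0) / c + slack   # P(pick -c e_i)
    probs = np.concatenate([p_plus, p_minus])
    probs /= probs.sum()        # guard against float rounding
    return rng.choice(2 * d, p=probs)   # ceil(log2(2d)) bits on the wire

def dequantize_cross_polytope(idx, d):
    """Map a received index back to its vertex +-sqrt(d) e_i."""
    g_hat = np.zeros(d)
    if idx < d:
        g_hat[idx] = np.sqrt(d)
    else:
        g_hat[idx - d] = -np.sqrt(d)
    return g_hat

# Sanity check of unbiasedness: the empirical mean of decoded vertices
# should approach g as the sample count grows.
rng = np.random.default_rng(0)
d = 8
g = rng.standard_normal(d)
g /= np.linalg.norm(g)          # place g on the unit sphere
est = np.mean([dequantize_cross_polytope(quantize_cross_polytope(g, rng), d)
               for _ in range(200_000)], axis=0)
print(np.abs(est - g).max())    # small, and shrinking with more samples
```

Only the sampled index crosses the network, so each gradient costs $\lceil \log_2(2d) \rceil$ bits, while the estimator's norm is always exactly $\sqrt{d}$; the code-based instances mentioned in the abstract can be read as richer point sets that spend a few more bits to reduce this estimation error.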
Pages
4573-4587 (15 pages)
Related Papers
50 items in total (items [31]-[40] shown)
  • [31] A stochastic multiple gradient descent algorithm
    Mercier, Quentin
    Poirion, Fabrice
    Desideri, Jean-Antoine
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2018, 271 (03) : 808 - 817
  • [32] Efficiency Ordering of Stochastic Gradient Descent
    Hu, Jie
    Doshi, Vishwaraj
    Eun, Do Young
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [33] Stochastic Gradient Descent on Riemannian Manifolds
    Bonnabel, Silvere
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2013, 58 (09) : 2217 - 2229
  • [34] Conjugate directions for stochastic gradient descent
    Schraudolph, NN
    Graepel, T
    ARTIFICIAL NEURAL NETWORKS - ICANN 2002, 2002, 2415 : 1351 - 1356
  • [35] STOCHASTIC MODIFIED FLOWS FOR RIEMANNIAN STOCHASTIC GRADIENT DESCENT
    Gess, Benjamin
    Kassing, Sebastian
    Rana, Nimit
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2024, 62 (06) : 3288 - 3314
  • [36] A Stochastic Gradient Descent Approach for Stochastic Optimal Control
    Archibald, Richard
    Bao, Feng
    Yong, Jiongmin
    EAST ASIAN JOURNAL ON APPLIED MATHEMATICS, 2020, 10 (04) : 635 - 658
  • [37] Stochastic modified equations for the asynchronous stochastic gradient descent
    An, Jing
    Lu, Jianfeng
    Ying, Lexing
    INFORMATION AND INFERENCE-A JOURNAL OF THE IMA, 2020, 9 (04) : 851 - 873
  • [38] Quantized Gradient-Descent Algorithm for Distributed Resource Allocation
    Zhou, Hongbing
    Yu, Weiyong
    Yi, Peng
    Hong, Yiguang
    UNMANNED SYSTEMS, 2019, 7 (02) : 119 - 136
  • [39] A modeling method for aero-engine by combining stochastic gradient descent with support vector regression
    Ren, Li-Hua
    Ye, Zhi-Feng
    Zhao, Yong-Ping
    AEROSPACE SCIENCE AND TECHNOLOGY, 2020, 99
  • [40] Stochastic Conjugate Gradient Descent Twin Support Vector Machine for Large Scale Pattern Classification
    Sharma, Sweta
    Rastogi, Reshma
    AI 2018: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, 11320 : 590 - 602