vqSGD: Vector Quantized Stochastic Gradient Descent

Cited: 5
Authors
Gandikota, Venkata [1]
Kane, Daniel [2,3]
Maity, Raj Kumar [4]
Mazumdar, Arya [5]
Affiliations
[1] Syracuse Univ, Dept Elect Engn & Comp Sci, Syracuse, NY 13244 USA
[2] Univ Calif San Diego, Dept Comp Sci, La Jolla, CA 92093 USA
[3] Univ Calif San Diego, Dept Math, La Jolla, CA 92093 USA
[4] UMass Amherst, Coll Informat & Comp Sci, Amherst, MA 01003 USA
[5] Univ Calif San Diego, Halicioglu Data Sci Inst, La Jolla, CA 92093 USA
Keywords
Vector quantization; communication efficiency; mean estimation; stochastic gradient descent (SGD)
DOI
10.1109/TIT.2022.3161620
CLC Classification
TP [Automation & Computer Technology]
Subject Classification
0812
Abstract
In this work, we present a family of vector quantization schemes, vqSGD (Vector-Quantized Stochastic Gradient Descent), that provide an asymptotic reduction in the communication cost with convergence guarantees in first-order distributed optimization. In the process, we derive the following fundamental information-theoretic fact: $\Theta(d/R^2)$ bits are necessary and sufficient (up to an additive $O(\log d)$ term) to describe an unbiased estimator $\hat{g}$ of any $g$ in the $d$-dimensional unit sphere, under the constraint that $\|\hat{g}\|_2 \le R$ almost surely, for $R > 1$. In particular, we consider a randomized scheme based on the convex hull of a point set that returns an unbiased estimator of a $d$-dimensional gradient vector with almost surely bounded norm. We provide multiple efficient instances of our scheme that are near optimal and require only $o(d)$ bits of communication, at the expense of a tolerable increase in error. The instances of our quantization scheme are obtained from well-known families of binary error-correcting codes and provide a smooth tradeoff between the communication cost and the estimation error of quantization. Furthermore, we show that vqSGD also offers automatic privacy guarantees.
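To make the convex-hull construction concrete, here is a minimal sketch in Python/NumPy of one natural instance consistent with the abstract's description: the scaled cross-polytope point set $\{\pm\sqrt{d}\,e_i\}$, whose convex hull contains the unit $\ell_2$ ball. The helper names (`quantize_cross_polytope`, `dequantize_cross_polytope`) are hypothetical, not from the paper; the sketch only illustrates the mechanism of sampling a vertex with convex-combination weights so that the sampled vertex is an unbiased, almost surely norm-bounded estimator of $g$.

```python
import numpy as np

def quantize_cross_polytope(g, rng):
    """Encode g (with ||g||_2 <= 1) as the index of one vertex of the
    scaled cross-polytope {+-sqrt(d) e_i}, sampled so that the vertex
    is an unbiased estimator of g. (Hypothetical helper, not the
    paper's API.)"""
    d = g.shape[0]
    c = np.sqrt(d)              # ||g||_1 <= sqrt(d) * ||g||_2 <= c,
                                # so g lies in conv{+-c e_i}
    l1 = np.abs(g).sum()
    # Put weight |g_i|/c on the vertex sign(g_i) * c * e_i; spread the
    # leftover mass (1 - l1/c) evenly over all 2d vertices, which adds
    # nothing to the mean by symmetry.
    slack = (1.0 - l1 / c) / (2 * d)
    p_plus = np.maximum(g, 0.0) / c + slack     # P(pick +c e_i)
    p_minus = np.maximum(-g, 0.0) / c + slack   # P(pick -c e_i)
    probs = np.concatenate([p_plus, p_minus])
    probs /= probs.sum()        # guard against float rounding
    return rng.choice(2 * d, p=probs)   # ceil(log2(2d)) bits on the wire

def dequantize_cross_polytope(idx, d):
    """Map a received index back to its vertex +-sqrt(d) e_i."""
    g_hat = np.zeros(d)
    if idx < d:
        g_hat[idx] = np.sqrt(d)
    else:
        g_hat[idx - d] = -np.sqrt(d)
    return g_hat

# Sanity check of unbiasedness: the empirical mean of decoded vertices
# should approach g as the sample count grows.
rng = np.random.default_rng(0)
d = 8
g = rng.standard_normal(d)
g /= np.linalg.norm(g)          # place g on the unit sphere
est = np.mean([dequantize_cross_polytope(quantize_cross_polytope(g, rng), d)
               for _ in range(200_000)], axis=0)
print(np.abs(est - g).max())    # small, and shrinking with more samples
```

Only the sampled index crosses the network, so each gradient costs $\lceil \log_2(2d) \rceil$ bits, while the estimator's norm is always exactly $\sqrt{d}$; the code-based instances mentioned in the abstract can be read as richer point sets that spend a few more bits to reduce this estimation error.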
Pages
4573-4587 (15 pages)
Related Papers
50 items in total (items [31]-[40] shown)
  • [31] A stochastic multiple gradient descent algorithm
    Mercier, Quentin
    Poirion, Fabrice
    Desideri, Jean-Antoine
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2018, 271 (03) : 808 - 817
  • [32] Efficiency Ordering of Stochastic Gradient Descent
    Hu, Jie
    Doshi, Vishwaraj
    Eun, Do Young
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [33] Stochastic Gradient Descent on Riemannian Manifolds
    Bonnabel, Silvere
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2013, 58 (09) : 2217 - 2229
  • [34] Conjugate directions for stochastic gradient descent
    Schraudolph, NN
    Graepel, T
    ARTIFICIAL NEURAL NETWORKS - ICANN 2002, 2002, 2415 : 1351 - 1356
  • [35] STOCHASTIC MODIFIED FLOWS FOR RIEMANNIAN STOCHASTIC GRADIENT DESCENT
    Gess, Benjamin
    Kassing, Sebastian
    Rana, Nimit
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2024, 62 (06) : 3288 - 3314
  • [36] A Stochastic Gradient Descent Approach for Stochastic Optimal Control
    Archibald, Richard
    Bao, Feng
    Yong, Jiongmin
    EAST ASIAN JOURNAL ON APPLIED MATHEMATICS, 2020, 10 (04) : 635 - 658
  • [37] Stochastic modified equations for the asynchronous stochastic gradient descent
    An, Jing
    Lu, Jianfeng
    Ying, Lexing
    INFORMATION AND INFERENCE-A JOURNAL OF THE IMA, 2020, 9 (04) : 851 - 873
  • [38] Quantized Gradient-Descent Algorithm for Distributed Resource Allocation
    Zhou, Hongbing
    Yu, Weiyong
    Yi, Peng
    Hong, Yiguang
    UNMANNED SYSTEMS, 2019, 7 (02) : 119 - 136
  • [39] A modeling method for aero-engine by combining stochastic gradient descent with support vector regression
    Ren, Li-Hua
    Ye, Zhi-Feng
    Zhao, Yong-Ping
    AEROSPACE SCIENCE AND TECHNOLOGY, 2020, 99
  • [40] Stochastic Conjugate Gradient Descent Twin Support Vector Machine for Large Scale Pattern Classification
    Sharma, Sweta
    Rastogi, Reshma
    AI 2018: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, 11320 : 590 - 602