Distributed Stochastic Gradient Descent Using LDGM Codes

Cited by: 0
Authors
Horii, Shunsuke [1 ]
Yoshida, Takahiro [2 ]
Kobayashi, Manabu [1 ]
Matsushima, Toshiyasu [1 ]
Affiliations
[1] Waseda Univ, Tokyo, Japan
[2] Yokohama Coll Commerce, Yokohama, Kanagawa, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
DOI
10.1109/isit.2019.8849580
Chinese Library Classification (CLC)
TP [Automation technology, computer technology];
Discipline code
0812;
Abstract
We consider a distributed learning problem in which the computation is carried out on a system consisting of a master node and multiple worker nodes. In such systems, slow-running machines called stragglers cause a significant decrease in performance. Recently, a coding-theoretic framework for mitigating stragglers in distributed learning, named Gradient Coding (GC), was established by Tandon et al. Most studies on GC aim to recover the gradient information completely, assuming that the Gradient Descent (GD) algorithm is used as the learning algorithm. On the other hand, if the Stochastic Gradient Descent (SGD) algorithm is used, it is not necessary to recover the gradient information completely; an unbiased estimator of the gradient is sufficient for learning. In this paper, we propose a distributed SGD scheme using Low-Density Generator Matrix (LDGM) codes. The proposed system may take longer than existing GC methods to recover the gradient information completely; however, it enables the master node to obtain a high-quality unbiased estimator of the gradient at low computational cost, which leads to an overall performance improvement.
Pages: 1417-1421 (5 pages)
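
The abstract describes how, under SGD, the master only needs an unbiased estimator of the gradient rather than the exact gradient. The following Python sketch illustrates that general idea on a toy scale: workers send sparse 0/1 linear combinations of per-partition gradients (an LDGM-style low-density generator matrix), some workers straggle, and the master forms an unbiased estimate from whatever arrives. The Bernoulli generator matrix, the inverse-probability estimator, and all parameter names here are illustrative assumptions for this sketch, not the encoding or decoding scheme actually proposed in the paper.

# Toy simulation of straggler-tolerant distributed SGD with a sparse
# (LDGM-style) 0/1 generator matrix. Illustrative sketch only; the
# inverse-probability estimator below is an assumption made for this example.
import numpy as np

rng = np.random.default_rng(0)

n_workers, n_parts, dim = 20, 20, 5
p_density = 0.2          # fraction of partitions each worker encodes (low density)
p_straggle = 0.3         # probability that a worker fails to respond in time

# Per-partition gradients (in a real system each worker would compute these
# from its local data shard; here they are just random vectors).
part_grads = rng.normal(size=(n_parts, dim))
full_grad = part_grads.sum(axis=0)   # quantity the master wants to estimate

# Sparse generator matrix: B[i, j] = 1 if worker i's coded gradient
# includes partition j. Low density keeps per-worker cost small.
B = rng.random((n_workers, n_parts)) < p_density

# Each worker sends one coded gradient: the sum of its assigned partitions.
coded = B.astype(float) @ part_grads            # shape (n_workers, dim)

# Stragglers: the master only hears back from a random subset of workers.
responded = rng.random(n_workers) >= p_straggle
R = np.flatnonzero(responded)

# Unbiased estimator over the randomness of B (assuming straggling is
# independent of the code): divide the sum of the received coded gradients
# by |R| * p_density.
g_hat = coded[R].sum(axis=0) / (len(R) * p_density)

print("true gradient    :", np.round(full_grad, 3))
print("unbiased estimate:", np.round(g_hat, 3))
# The estimate would then drive an SGD step at the master, e.g. w -= lr * g_hat.

With p_density small, each worker touches only a few partitions, which reflects the low-computational-cost property the abstract points to; the quality of g_hat improves as more workers respond, even though the full gradient is never recovered exactly.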