Communication-Efficient Parallelization Strategy for Deep Convolutional Neural Network Training

Cited by: 0
Authors
Lee, Sunwoo [1 ]
Agrawal, Ankit [1 ]
Balaprakash, Prasanna [2 ]
Choudhary, Alok [1 ]
Liao, Wei-keng [1 ]
Affiliations
[1] Northwestern Univ, EECS Dept, Evanston, IL 60208 USA
[2] Argonne Natl Lab, Lemont, IL USA
Keywords
Convolutional Neural Network; Deep Learning; Parallelization; Distributed-Memory Parallelization
DOI
10.1109/MLHPC.2018.000-4
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Training Convolutional Neural Network (CNN) models is extremely time-consuming, and efficient parallelization is key to completing the training in a reasonable amount of time. The well-known synchronous Stochastic Gradient Descent (SGD) algorithm suffers from high inter-process communication and synchronization costs. To address these problems, the asynchronous SGD algorithm employs a master-slave model for parameter updates. However, it can yield a poor convergence rate due to gradient staleness, and the master-slave model does not scale to a large number of compute nodes. In this paper, we present a communication-efficient gradient averaging algorithm for synchronous SGD that adopts a few design strategies to maximize the degree of overlap between computation and communication. A time-complexity analysis shows that our algorithm outperforms the traditional allreduce-based algorithm. Training two popular deep CNN models, VGG-16 and ResNet-50, on the ImageNet dataset, our experiments on Cori Phase-I, a Cray XC40 supercomputer at NERSC, show that our algorithm achieves a 2516.36x speedup for VGG-16 and a 2734.25x speedup for ResNet-50 using up to 8192 cores.
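The core idea, overlapping gradient averaging with back-propagation, can be illustrated with a short sketch. The example below is a hypothetical minimal illustration, not the paper's actual algorithm, assuming mpi4py and NumPy: a nonblocking allreduce is posted for each layer's gradient as soon as it becomes available, so the reduction proceeds in the background while the backward pass continues.

# Minimal sketch (hypothetical, not the authors' implementation) of
# overlapping layer-wise gradient averaging with back-propagation in
# synchronous SGD, using nonblocking MPI collectives.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
nprocs = comm.Get_size()

# Stand-in per-layer gradients, as if produced by the backward pass
# from the output layer toward the input layer.
layer_shapes = [(512, 512), (512, 256), (256, 10)]
grads = [np.random.rand(*s) for s in layer_shapes]
sums = [np.empty_like(g) for g in grads]

requests = []
for g, out in zip(reversed(grads), reversed(sums)):
    # Post a nonblocking allreduce as soon as this layer's gradient is
    # ready; the reduction overlaps with computing the next gradient
    # (the actual backward computation is omitted here).
    requests.append(comm.Iallreduce(g, out, op=MPI.SUM))

# Complete all outstanding reductions, then average across processes.
MPI.Request.Waitall(requests)
avg_grads = [s / nprocs for s in sums]

Such a script would be launched with, e.g., mpirun -n 8 python overlap_sketch.py; the file name is illustrative.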
Pages: 47-56
Page count: 10