Distributed Asynchronous Optimization of Convolutional Neural Networks

Times cited: 0
Authors
Chan, William [1 ]
Lane, Ian [1 ,2 ]
Affiliations
[1] Carnegie Mellon University, Electrical & Computer Engineering, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon University, Language Technologies Institute, Pittsburgh, PA 15213 USA
Keywords
deep neural network; distributed optimization
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Recently, deep Convolutional Neural Networks have been shown to outperform Deep Neural Networks for acoustic modelling, producing state-of-the-art accuracy in speech recognition tasks. Convolutional models provide increased model robustness through pooling invariance and weight sharing across spectrum and time. However, training convolutional models is a computationally expensive optimization procedure, especially when combined with large training corpora. In this paper, we present a novel algorithm for scalable training of deep Convolutional Neural Networks across multiple GPUs. Our distributed asynchronous stochastic gradient descent algorithm incorporates sparse gradients, momentum and gradient decay to accelerate the training of these networks. Our approach is stable, requiring neither warm-starting nor excessively large minibatches. The proposed approach enables convolutional models to be trained efficiently across multiple GPUs, scaling a model asynchronously across 5 GPU workers with 68% efficiency.
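
The abstract names the key ingredients of the training algorithm: a shared parameter store, asynchronous stochastic gradient descent across several GPU workers, sparse gradient pushes, momentum, and gradient decay. The sketch below is a minimal single-machine illustration of that combination, not the paper's implementation: the threads standing in for GPU workers, the toy least-squares objective, the top-k sparsification rule, the ParameterServer/worker names, and the reading of "gradient decay" as an L2-style term are all assumptions made for the example.

import threading
import numpy as np

class ParameterServer:
    """Shared parameter store; workers push sparse gradient updates (sketch only)."""
    def __init__(self, dim, lr=0.01, momentum=0.9, decay=1e-4):
        self.w = np.zeros(dim)            # model parameters
        self.v = np.zeros(dim)            # momentum buffer
        self.lr, self.momentum, self.decay = lr, momentum, decay
        self.lock = threading.Lock()

    def push_sparse(self, idx, grad):
        # Apply a sparse update: only the entries listed in idx change.
        with self.lock:
            g = grad + self.decay * self.w[idx]   # "gradient decay" guessed as an L2-style term
            self.v[idx] = self.momentum * self.v[idx] - self.lr * g
            self.w[idx] += self.v[idx]

    def pull(self):
        with self.lock:
            return self.w.copy()

def worker(ps, X, y, steps, k):
    # One asynchronous worker: pull weights, compute a gradient on one
    # sample, keep only the k largest-magnitude entries, push them back.
    rng = np.random.default_rng()
    for _ in range(steps):
        w = ps.pull()
        i = rng.integers(len(X))
        full_grad = (X[i] @ w - y[i]) * X[i]      # toy least-squares gradient
        idx = np.argsort(-np.abs(full_grad))[:k]  # sparsify: top-k entries
        ps.push_sparse(idx, full_grad[idx])

if __name__ == "__main__":
    dim, n = 50, 2000
    rng = np.random.default_rng(0)
    X = rng.normal(size=(n, dim))
    w_true = rng.normal(size=dim)
    y = X @ w_true
    ps = ParameterServer(dim)
    workers = [threading.Thread(target=worker, args=(ps, X, y, 2000, 10))
               for _ in range(5)]                 # 5 threads stand in for 5 GPU workers
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    print("parameter error:", np.linalg.norm(ps.pull() - w_true))

Running it with 5 worker threads mirrors the 5-GPU configuration reported above; in a real multi-GPU deployment the pull/push calls would cross device or host boundaries rather than a thread lock.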
Pages: 1073-1077
Number of pages: 5