Distributed Asynchronous Optimization of Convolutional Neural Networks

Times Cited: 0
Authors
Chan, William [1]
Lane, Ian [1,2]
Affiliations
[1] Carnegie Mellon Univ, Elect & Comp Engn, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
deep neural network; distributed optimization;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, deep Convolutional Neural Networks have been shown to outperform Deep Neural Networks for acoustic modelling, producing state-of-the-art accuracy in speech recognition tasks. Convolutional models provide increased robustness through pooling invariance and weight sharing across spectrum and time. However, training convolutional models is a computationally expensive optimization procedure, especially on large training corpora. In this paper, we present a novel algorithm for scalable training of deep Convolutional Neural Networks across multiple GPUs. Our distributed asynchronous stochastic gradient descent algorithm incorporates sparse gradients, momentum, and gradient decay to accelerate training. The approach is stable, requiring neither warm-starting nor excessively large minibatches, and it enables convolutional models to be trained efficiently across multiple GPUs, scaling a model asynchronously across 5 GPU workers with 68% efficiency.
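The abstract's key ingredients, asynchronous SGD with sparse gradient exchange, momentum, and gradient decay, can be illustrated with a minimal sketch. The sketch below is an assumption-laden stand-in, not the authors' implementation: `ParamServer`, `worker`, the magnitude-threshold sparsification, and the exponential staleness-decay rule are all hypothetical choices, and a least-squares model substitutes for the CNN.

```python
import threading
import numpy as np

class ParamServer:
    """Hypothetical parameter server: holds shared weights and applies
    momentum updates from asynchronously arriving sparse gradients."""

    def __init__(self, dim, lr=0.05, momentum=0.9, decay=0.95):
        self.w = np.zeros(dim)      # shared model parameters
        self.v = np.zeros(dim)      # momentum buffer
        self.step = 0               # global update clock
        self.lr, self.momentum, self.decay = lr, momentum, decay
        self.lock = threading.Lock()

    def fetch(self):
        """Return a snapshot of the weights and the clock at fetch time."""
        with self.lock:
            return self.w.copy(), self.step

    def push(self, idx, grad, fetch_step):
        """Apply a sparse gradient (values `grad` at coordinates `idx`),
        scaled down exponentially with staleness (an assumed decay rule)."""
        with self.lock:
            staleness = self.step - fetch_step
            scale = self.decay ** max(staleness, 0)
            self.v[idx] = self.momentum * self.v[idx] - self.lr * scale * grad
            self.w[idx] += self.v[idx]
            self.step += 1

def worker(server, X, y, n_steps, batch=32, threshold=1e-3):
    """One asynchronous worker: fetch weights, compute a minibatch
    gradient, sparsify it by magnitude, and push it to the server."""
    rng = np.random.default_rng()
    for _ in range(n_steps):
        w, step = server.fetch()
        rows = rng.choice(len(X), size=batch, replace=False)
        err = X[rows] @ w - y[rows]                     # least-squares residual
        grad = X[rows].T @ err / batch                  # stands in for CNN backprop
        idx = np.flatnonzero(np.abs(grad) > threshold)  # sparse gradient
        if idx.size:
            server.push(idx, grad[idx], step)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_true = rng.normal(size=20)
    X = rng.normal(size=(2000, 20))
    y = X @ w_true
    server = ParamServer(dim=20)
    # The paper scales to 5 asynchronous GPU workers; 5 threads stand in here.
    threads = [threading.Thread(target=worker, args=(server, X, y, 300))
               for _ in range(5)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("parameter error:", np.linalg.norm(server.w - w_true))
```

Scaling stale updates down rather than discarding them keeps every worker contributing, which is consistent with the abstract's claim of stability without warm-starting, but the specific decay rule above is an assumption rather than the paper's stated method.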
Pages: 1073-1077
Number of pages: 5
Related Papers
50 items in total
  • [21] Convolutional Neural Networks for estimating spatially-distributed evapotranspiration
    Garcia-Pedrero, Angel
    Gonzalo-Martin, Consuelo
    Lillo-Saavedra, Mario F.
    Rodriguez-Esparragon, Dionisio
    Menasalvas, Ernestina
    IMAGE AND SIGNAL PROCESSING FOR REMOTE SENSING XXIII, 2017, 10427
  • [22] Accelerating neural network training with distributed asynchronous and selective optimization (DASO)
    Coquelin, Daniel
    Debus, Charlotte
    Goetz, Markus
    von der Lehr, Fabrice
    Kahn, James
    Siggel, Martin
    Streit, Achim
    JOURNAL OF BIG DATA, 2022, 9 (01)
  • [23] Evaluation of MPI Allreduce for Distributed Training of Convolutional Neural Networks
    Castello, Adrian
    Catalan, Mar
    Dolz, Manuel F.
    Mestre, Jose I.
    Quintana-Orti, Enrique S.
    Duato, Jose
    2021 29TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING (PDP 2021), 2021, : 109 - 116
  • [25] Asynchronous Stochastic Optimization for Sequence Training of Deep Neural Networks
    Heigold, Georg
    McDermott, Erik
    Vanhoucke, Vincent
    Senior, Andrew
    Bacchiani, Michiel
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [26] Hyperparameter Optimization for Convolutional Neural Networks with Genetic Algorithms and Bayesian Optimization
    Puentes G., David E.
    Barrios H., Carlos J.
    Navaux, Philippe O. A.
    2022 IEEE LATIN AMERICAN CONFERENCE ON COMPUTATIONAL INTELLIGENCE (LA-CCI), 2022, : 131 - 135
  • [27] Efficient Optimization of Convolutional Neural Networks using Particle Swarm Optimization
    Yamasaki, Toshihiko
    Honma, Takuto
    Aizawa, Kiyoharu
    2017 IEEE THIRD INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM 2017), 2017, : 70 - 73
  • [28] Asynchronous Distributed Joint Optimization in Wireless Multi-Hop Networks
    Liu, Jain-Shing
    Lin, Chun-Hung Richard
    IEEE COMMUNICATIONS LETTERS, 2015, 19 (09) : 1620 - 1623
  • [29] Fully asynchronous distributed optimization with linear convergence over directed networks
    SHA Xingyu
    ZHANG Jiaqi
    YOU Keyou
    ACTA SCIENTIARUM NATURALIUM UNIVERSITATIS SUNYATSENI, 2023, 62 (05): 1 - 23
  • [30] Atomic Layer Deposition Optimization Using Convolutional Neural Networks
    Cagnazzo, Julian
    Abuomar, Osama
    Yanguas-Gil, Angel
    Elam, Jeffrey W.
    2021 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI 2021), 2021, : 228 - 232