Performance and Consistency Analysis for Distributed Deep Learning Applications

Cited by: 0
Authors
Jia, Danlin [1]
Saha, Manoj Pravakar [2]
Bhimani, Janki [2]
Mi, Ningfang [1]
Affiliations
[1] Northeastern Univ, Boston, MA 02115 USA
[2] Florida Int Univ, Miami, FL 33199 USA
Funding
U.S. National Science Foundation
DOI
10.1109/IPCCC50635.2020.9391566
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Accelerating the training of Deep Neural Network (DNN) models is important for applying deep learning techniques successfully in fields such as computer vision and speech recognition. Distributed frameworks help speed up training for large DNN models and datasets. Much work has been done to improve model accuracy and training efficiency based on mathematical analysis of the computations in Convolutional Neural Networks (CNNs). However, to run distributed deep learning applications in the real world, users and developers need to consider the impact of how system resources are distributed. In this work, we deploy a real distributed deep learning cluster with multiple virtual machines. We conduct an in-depth analysis to understand the impact of system configurations, distribution topologies, and application parameters on the latency and correctness of distributed deep learning applications. We analyze how performance varies under different model consistency and data parallelism settings by profiling run-time system utilization and tracking application activities. Based on our observations and analysis, we develop design guidelines for accelerating distributed deep learning training in virtualized environments.
Pages: 8