Benchmarking and Analyzing Deep Neural Network Training

Cited by: 0
Authors
Zhu, Hongyu [1]
Akrout, Mohamed [1]
Zheng, Bojian [1]
Pelegris, Andrew [1]
Jayarajan, Anand [2]
Phanishayee, Amar [3]
Schroeder, Bianca [1]
Pekhimenko, Gennady [1]
Affiliations
[1] Univ Toronto, Toronto, ON, Canada
[2] Univ British Columbia, Vancouver, BC, Canada
[3] Microsoft Res, Redmond, WA, USA
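Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract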
The recent popularity of deep neural networks (DNNs) has generated considerable research interest in performing DNN-related computation efficiently. However, the primary focus is usually very narrow and limited to (i) inference, i.e., how to efficiently execute already trained models, and (ii) image classification networks as the primary benchmark for evaluation. Our primary goal in this work is to break this myopic view by (i) proposing a new benchmark suite for DNN training, called TBD, which comprises a representative set of eight DNN models and covers six major machine learning applications: image classification, machine translation, speech recognition, object detection, adversarial networks, and reinforcement learning, and (ii) performing an extensive performance analysis of these models on three major deep learning frameworks (TensorFlow, MXNet, CNTK) across different hardware configurations (single-GPU, multi-GPU, and multi-machine). We present a new toolchain for performance analysis of these models that combines the targeted usage of existing performance analysis tools, careful selection of performance metrics, and methodologies to analyze the results. We also build a new set of tools for memory profiling in three major frameworks. These tools can shed light on precisely how much memory is consumed by different data structures (weights, activations, gradients, workspace) in DNN training. Using our tools and methodologies, we make several important observations and recommendations on where future DNN training research and optimization should be focused.
Pages: 88-100
Page count: 13