Accelerating Neural Network Training: A Brief Review

Cited by: 3

Authors
Nokhwal, Sahil [1 ]
Chilakalapudi, Priyanka [1 ]
Donekal, Preeti [1 ]
Nokhwal, Suman [2 ]
Pahune, Saurabh [3 ]
Chaudhary, Ankit [4 ]
Affiliations
[1] Univ Memphis, Memphis, TN 38152 USA
[2] Intercontinental Exchange Inc, Pleasanton, CA USA
[3] Cardinal Hlth, Dublin, OH USA
[4] Jawaharlal Nehru Univ, New Delhi, India
Keywords
Neural Network Training; Acceleration Techniques; Training Optimization; Deep Learning Speedup; Model Training Efficiency; Machine Learning Accelerators; Training Time Reduction; Optimization Strategies
DOI
10.1145/3665065.3665071
CLC (Chinese Library Classification)
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Training a deep neural network is time-consuming and costly. Although researchers have made considerable progress in this area, resource constraints mean that further work is still required. This study examines approaches for accelerating the training of deep neural networks (DNNs), focusing on three state-of-the-art models: ResNet50, Vision Transformer (ViT), and EfficientNet. It applies three optimization techniques, namely Gradient Accumulation (GA), Automatic Mixed Precision (AMP), and Pin Memory (PM), and assesses their effect on training speed and computational efficiency for the models above. GA proves to be an effective strategy, reducing training time noticeably and allowing the models to converge faster. AMP speeds up computation by exploiting lower-precision arithmetic while preserving model accuracy. The study also investigates Pin Memory, i.e., page-locked host memory, as a means of speeding up data transfer between the CPU and GPU, a further avenue for improving overall performance. The experimental results show that combining these techniques significantly accelerates DNN training, offering practical insights for practitioners seeking to make deep learning workflows more efficient.
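To illustrate how the three techniques compose, the sketch below shows a minimal training loop. PyTorch is an assumption (the abstract does not name a framework), and the model, data, and hyperparameters are placeholders, not the paper's experimental setup: pin_memory=True on the DataLoader enables asynchronous CPU-to-GPU copies (PM), autocast and GradScaler implement AMP, and the loss is accumulated over several micro-batches before each optimizer step (GA).

```python
# Minimal sketch (assumed PyTorch) combining the abstract's three techniques:
# Gradient Accumulation (GA), Automatic Mixed Precision (AMP), and pinned
# host memory (PM). Model, data, and hyperparameters are illustrative stand-ins.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder model and synthetic data; the paper trains ResNet50, ViT,
# and EfficientNet instead.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 10)).to(device)
dataset = TensorDataset(torch.randn(64, 3, 224, 224),
                        torch.randint(0, 10, (64,)))

# PM: pin_memory=True places batches in page-locked host RAM, which allows
# the asynchronous host-to-device copies requested via non_blocking=True.
loader = DataLoader(dataset, batch_size=8, shuffle=True, pin_memory=True)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler(enabled=(device.type == "cuda"))

accum_steps = 4  # GA: one optimizer step per 4 micro-batches

model.train()
for step, (x, y) in enumerate(loader):
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)

    # AMP: run the forward pass in reduced precision where it is safe.
    with torch.cuda.amp.autocast(enabled=(device.type == "cuda")):
        loss = criterion(model(x), y) / accum_steps  # rescale for GA

    scaler.scale(loss).backward()  # gradients accumulate across micro-batches

    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)  # unscales gradients, then applies the update
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
```

Dividing the loss by accum_steps keeps the accumulated gradient equal in magnitude to that of one large batch, which is what lets GA stand in for a larger batch size on memory-constrained GPUs.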
Pages: 31-35 (5 pages)