Bypassing Stationary Points in Training Deep Learning Models

Cited by: 0
Authors
Jung, Jaeheun [1 ]
Lee, Donghun [2 ]
Affiliations
[1] Korea Univ, Grad Sch Math, Seoul 02841, South Korea
[2] Korea Univ, Dept Math, Seoul 02841, South Korea
Funding
National Research Foundation of Singapore
Keywords
Training; Pipelines; Neural networks; Deep learning; Computational modeling; Vectors; Classification algorithms; Bypassing; gradient descent; neural network; stationary points
DOI
10.1109/TNNLS.2024.3411020
CLC number
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Gradient-descent-based optimizers are prone to slowdowns when training deep learning models, as stationary points are ubiquitous in the loss landscapes of most neural networks. We present an intuitive concept of bypassing stationary points and realize it as a novel method designed to actively rescue optimizers from slowdowns encountered during neural network training. The method, the bypass pipeline, revitalizes the optimizer by extending the model space and later contracts the model back to its original space under function-preserving algebraic constraints. We implement the method as the bypass algorithm, verify that the algorithm exhibits the theoretically expected bypassing behavior, and demonstrate its empirical benefit on regression and classification benchmarks. The bypass algorithm is highly practical, as it is computationally efficient and compatible with other improvements to first-order optimizers. In addition, bypassing for neural networks opens new theoretical research directions such as model-specific bypassing and neural architecture search (NAS).
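To make the extend-then-contract idea concrete, below is a minimal sketch, not the authors' code (the paper's actual constraints are specified in the article), of one well-known function-preserving expansion: Net2Net-style unit duplication in a two-layer ReLU network. The names expand, contract, and forward are hypothetical illustrations. Expansion appends a copy of a hidden unit and splits its outgoing weights evenly; contraction merges the twin units back; both moves leave the network function unchanged.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward(params, x):
    # Two-layer network: y = W2 @ relu(W1 @ x + b1) + b2.
    W1, b1, W2, b2 = params
    return W2 @ relu(W1 @ x + b1) + b2

def expand(params, k):
    # Extend the model space: append a copy of hidden unit k and split
    # its outgoing weights evenly, so the network function is unchanged.
    W1, b1, W2, b2 = params
    W1e = np.vstack([W1, W1[k:k + 1]])      # duplicate incoming weights
    b1e = np.append(b1, b1[k])              # duplicate bias
    W2e = np.hstack([W2, W2[:, k:k + 1]])   # duplicate outgoing column
    W2e[:, k] *= 0.5                        # split the contribution ...
    W2e[:, -1] *= 0.5                       # ... evenly between the twins
    return W1e, b1e, W2e, b2

def contract(params, k):
    # Contract back to the original space: merge the appended unit into
    # unit k by summing their outgoing weights (valid while the two units
    # still share identical incoming weights and biases).
    W1e, b1e, W2e, b2 = params
    W2c = W2e[:, :-1].copy()
    W2c[:, k] += W2e[:, -1]
    return W1e[:-1], b1e[:-1], W2c, b2

rng = np.random.default_rng(0)
params = (rng.normal(size=(4, 3)), rng.normal(size=4),
          rng.normal(size=(2, 4)), rng.normal(size=2))
x = rng.normal(size=3)

expanded = expand(params, k=1)
assert np.allclose(forward(params, x), forward(expanded, x))
assert np.allclose(forward(params, x), forward(contract(expanded, k=1), x))
```

Note that the contraction step in this sketch is function-preserving only while the twin units share identical incoming weights; once training moves them apart, algebraic constraints of the kind described in the paper are what make a valid contraction back to the original space possible.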
Pages: 18859-18871
Number of pages: 13
Related papers
50 records in total
  • [31] Training and Evaluation of Deep Policies Using Reinforcement Learning and Generative Models
    Ghadirzadeh, Ali
    Poklukar, Petra
    Arndt, Karol
    Finn, Chelsea
    Kyrki, Ville
    Kragic, Danica
    Björkman, Mårten
    Journal of Machine Learning Research, 2022, 23
  • [32] Training confounder-free deep learning models for medical applications
    Zhao, Qingyu
    Adeli, Ehsan
    Pohl, Kilian M.
    Nature Communications, 11
  • [33] Automated code transformation for distributed training of TensorFlow deep learning models
    Sim, Yusung
    Shin, Wonho
    Lee, Sungho
    Science of Computer Programming, 2025, 242
  • [35] Data Augmentation in Training Deep Learning Models for Malware Family Classification
    Ding, Yuxin
    Wang, Guangbin
    Ma, Yubin
    Ding, Haoxuan
    Proceedings of 2021 International Conference on Machine Learning and Cybernetics (ICMLC), 2021: 102-107
  • [36] Performance Analysis and Characterization of Training Deep Learning Models on Mobile Device
    Liu, Jie
    Liu, Jiawen
    Du, Wan
    Li, Dong
    2019 IEEE 25th International Conference on Parallel and Distributed Systems (ICPADS), 2019: 506-515
  • [37] Distributed Training for Deep Learning Models On An Edge Computing Network Using Shielded Reinforcement Learning
    Sen, Tanmoy
    Shen, Haiying
    2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS 2022), 2022: 581-591
  • [38] Carbon Footprint of Selecting and Training Deep Learning Models for Medical Image Analysis
    Selvan, Raghavendra
    Bhagwat, Nikhil
    Anthony, Lasse F. Wolff
    Kanding, Benjamin
    Dam, Erik B.
    Medical Image Computing and Computer Assisted Intervention, MICCAI 2022, Pt V, 2022, 13435: 506-516
  • [39] A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models
    Suh, Namjoon
    Cheng, Guang
    Annual Review of Statistics and Its Application, 2025, 12: 177-207
  • [40] Characterizing the Performance of Accelerated Jetson Edge Devices for Training Deep Learning Models
    Prashanthi, S. K.
    Kesanapalli, S. A.
    Simmhan, Y.
    Performance Evaluation Review, 2023, 51 (01): 37-38