Evolution and Role of Optimizers in Training Deep Learning Models

Cited by: 2
Authors
Wen, XiaoHao [1 ]
Zhou, MengChu [2 ,3 ]
Affiliations
[1] Guangxi Normal Univ, Guilin 541004, Peoples R China
[2] Zhejiang Gongshang Univ, Sch Informat & Elect Engn, Hangzhou 310018, Peoples R China
[3] New Jersey Inst Technol, Helen & John C Hartmann Dept Elect & Comp Engn, Newark, NJ 07102 USA
DOI
10.1109/JAS.2024.124806
CLC Number
TP [automation technology; computer technology]
Discipline Code
0812
Abstract
To perform well, deep learning (DL) models must be trained well. Which optimizer should be adopted? We answer this question by discussing how optimizers have evolved from traditional methods, such as gradient descent, to more advanced techniques that address the challenges posed by high-dimensional, non-convex problem spaces. Ongoing challenges include hyperparameter sensitivity, balancing convergence speed against generalization performance, and improving the interpretability of the optimization process. Researchers continue to seek robust, efficient, and universally applicable optimizers to advance the field of DL across various domains.
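As an illustration of the evolution the abstract describes (this sketch is not from the paper itself), the code below contrasts a plain gradient-descent update with an Adam-style adaptive update on a toy one-dimensional quadratic. Function names, the objective, and all hyperparameter values are illustrative assumptions chosen only to show the two update rules side by side:

```python
import math

def grad(w):
    # Gradient of the toy objective f(w) = (w - 3)^2, minimized at w* = 3.
    return 2.0 * (w - 3.0)

def gradient_descent(w0, lr=0.1, steps=200):
    # Classic update: step opposite the raw gradient with a fixed learning rate.
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def adam(w0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=200):
    # Adam-style update: exponential moving averages of the gradient (m)
    # and squared gradient (v), with bias correction, rescale each step.
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias-corrected moments
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Both methods approach the minimizer w* = 3 from w0 = 0.
print(gradient_descent(0.0))
print(adam(0.0))
```

On this convex toy problem both rules converge; the adaptive rescaling in Adam matters most in the high-dimensional, non-convex settings the abstract highlights, where per-parameter gradient scales differ widely.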
Pages: 2039-2042
Page count: 4