Evolution and Role of Optimizers in Training Deep Learning Models

Cited by: 2
Authors
Wen, XiaoHao [1 ]
Zhou, MengChu [2 ,3 ]
Affiliations
[1] Guangxi Normal Univ, Guilin 541004, Peoples R China
[2] Zhejiang Gongshang Univ, Sch Informat & Elect Engn, Hangzhou 310018, Peoples R China
[3] New Jersey Inst Technol, Helen & John C Hartmann Dept Elect & Comp Engn, Newark, NJ 07102 USA
DOI
10.1109/JAS.2024.124806
CLC Number
TP [automation technology; computer technology]
Discipline Code
0812
Abstract
To perform well, deep learning (DL) models must be trained well. Which optimizer should be adopted? We answer this question by discussing how optimizers have evolved from traditional methods, such as gradient descent, to more advanced techniques that address the challenges posed by high-dimensional, non-convex problem spaces. Ongoing challenges include hyperparameter sensitivity, balancing convergence speed against generalization performance, and improving the interpretability of the optimization process. Researchers continue to seek robust, efficient, and universally applicable optimizers to advance the field of DL across various domains.
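As an illustration of the evolution the abstract describes (this sketch is not from the paper itself), the code below contrasts a plain gradient-descent update with an Adam-style adaptive update on a toy one-dimensional quadratic. Function names, the objective, and all hyperparameter values are illustrative assumptions chosen only to show the two update rules side by side:

```python
import math

def grad(w):
    # Gradient of the toy objective f(w) = (w - 3)^2, minimized at w* = 3.
    return 2.0 * (w - 3.0)

def gradient_descent(w0, lr=0.1, steps=200):
    # Classic update: step opposite the raw gradient with a fixed learning rate.
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def adam(w0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=200):
    # Adam-style update: exponential moving averages of the gradient (m)
    # and squared gradient (v), with bias correction, rescale each step.
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias-corrected moments
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Both methods approach the minimizer w* = 3 from w0 = 0.
print(gradient_descent(0.0))
print(adam(0.0))
```

On this convex toy problem both rules converge; the adaptive rescaling in Adam matters most in the high-dimensional, non-convex settings the abstract highlights, where per-parameter gradient scales differ widely.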
Pages: 2039-2042
Page count: 4