Evolution and Role of Optimizers in Training Deep Learning Models

Cited by: 2

Authors:

Wen, XiaoHao [1]

Zhou, MengChu [2,3]

Affiliations:

[1] Guangxi Normal Univ, Guilin 541004, Peoples R China

[2] Zhejiang Gongshang Univ, Sch Informat & Elect Engn, Hangzhou 310018, Peoples R China

[3] New Jersey Inst Technol, Helen & John C Hartmann Dept Elect & Comp Engn, Newark, NJ 07102 USA

Source:

IEEE-CAA JOURNAL OF AUTOMATICA SINICA | 2024 / Vol. 11 / Issue 10

DOI:

10.1109/JAS.2024.124806

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

To perform well, deep learning (DL) models have to be trained well. Which optimizer should be adopted? We answer this question by discussing how optimizers have evolved from traditional methods like gradient descent to more advanced techniques that address the challenges posed by high-dimensional and non-convex problem spaces. Ongoing challenges include hyperparameter sensitivity, balancing convergence against generalization performance, and improving the interpretability of optimization processes. Researchers continue to seek robust, efficient, and universally applicable optimizers to advance the field of DL across various domains.

Pages: 2039-2042

Page count: 4

