A Comparative Study of Recently Deep Learning Optimizers

Cited by: 1

Authors:

Liu, Yan [1]

Zhang, Maojun [1]

Zhong, Zhiwei [1]

Zeng, Xiangrong [1]

Long, Xin [1]

Affiliations:

[1] National University of Defense Technology, College of Systems Engineering, Changsha, People's Republic of China

Source:

INTERNATIONAL CONFERENCE ON ALGORITHMS, HIGH PERFORMANCE COMPUTING, AND ARTIFICIAL INTELLIGENCE (AHPCAI 2021) | 2021 / Vol. 12156

Keywords:

optimizers; deep learning; hessian matrix; proxy algorithm;

DOI:

10.1117/12.2626430

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep learning has achieved great success in computer vision, natural language processing, recommendation systems, and other fields. However, deep neural network (DNN) models are very complex, often containing millions of parameters and tens or even hundreds of layers. When optimizing the weights of a DNN, training easily falls into local optima, which makes better performance hard to achieve. Choosing an effective optimizer that yields a network with higher precision and stronger generalization ability is therefore of great significance. In this article, we review some popular historical and state-of-the-art optimizers and group them into three main streams: first-order optimizers, which accelerate the convergence of stochastic gradient descent and/or adaptively adjust learning rates; second-order optimizers, which use second-order information about the loss landscape to help escape from local optima; and proxy optimizers, which handle non-differentiable loss functions by combining with the proxy algorithm. We also summarize the first- and second-order moments used in different optimizers. Moreover, we compare several optimizers on image classification. The results show that first-order optimizers such as AdaMod and Ranger not only have low computational cost but also converge quickly. Meanwhile, optimizers that introduce curvature information, such as AdaBelief and Apollo, generalize better, especially when optimizing complex networks.
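
The abstract refers to the first- and second-order moments that different optimizers maintain. As a rough illustration only (not the authors' code or experimental setup), the pure-Python sketch below contrasts how an Adam-style update and a simplified AdaBelief-style update compute the second moment for a single scalar parameter. The function names, the hyperparameter defaults, the `state` dictionary layout, and the toy loss f(theta) = theta**2 are all assumptions made for this example.

```python
import math


def adam_step(theta, grad, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-style update for a scalar parameter (illustrative sketch)."""
    state["t"] += 1
    # First moment: exponential moving average (EMA) of the gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    # Second moment: EMA of the squared gradient.
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Bias correction compensates for the zero initialisation of m and v.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return theta - lr * m_hat / (math.sqrt(v_hat) + eps)


def adabelief_step(theta, grad, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One simplified AdaBelief-style update (illustrative sketch)."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    # Second moment: EMA of the squared deviation of the gradient from its
    # own EMA, instead of the squared gradient itself.
    diff = grad - state["m"]
    state["v"] = beta2 * state["v"] + (1 - beta2) * diff * diff
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    s_hat = state["v"] / (1 - beta2 ** state["t"])
    return theta - lr * m_hat / (math.sqrt(s_hat) + eps)


def minimize(step_fn, theta0=5.0, steps=500):
    """Run step_fn on the toy loss f(theta) = theta**2 (hypothetical helper)."""
    state = {"t": 0, "m": 0.0, "v": 0.0}
    theta = theta0
    for _ in range(steps):
        grad = 2.0 * theta  # analytic gradient of theta**2
        theta = step_fn(theta, grad, state)
    return theta


if __name__ == "__main__":
    print("Adam-style final theta:     ", minimize(adam_step))
    print("AdaBelief-style final theta:", minimize(adabelief_step))
```

The two functions differ only in the quantity that is squared before averaging. When successive gradients agree closely, the deviation-based second moment is small, so the effective step is larger. This toy problem is not a stand-in for the image classification comparison reported in the paper.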

Pages: 9

50 items in total

[1] Plant Disease Classification: A Comparative Evaluation of Convolutional Neural Networks and Deep Learning Optimizers
Saleem, Muhammad Hammad
Potgieter, Johan
Arif, Khalid Mahmood
PLANTS-BASEL, 2020, 9 (10): 1 - 17
[2] Experimental Comparison of Stochastic Optimizers in Deep Learning
Okewu, Emmanuel
Adewole, Philip
Sennaike, Oladipupo
COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2019, PT V: 19TH INTERNATIONAL CONFERENCE, SAINT PETERSBURG, RUSSIA, JULY 14, 2019, PROCEEDINGS, PART V, 2019, 11623 : 704 - 715
[3] NeuroEvoBench: Benchmarking Evolutionary Optimizers for Deep Learning Applications
Lange, Robert Tjarko
Tang, Yujin
Tian, Yingtao
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
[4] Evolution and Role of Optimizers in Training Deep Learning Models
Wen, XiaoHao
Zhou, MengChu
IEEE/CAA Journal of Automatica Sinica, 2024, 11 (10) : 2039 - 2042
[5] Evolution and Role of Optimizers in Training Deep Learning Models
Wen, XiaoHao
Zhou, MengChu
IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2024, 11 (10) : 2039 - 2042
[6] A Comparative Study on Classification by Deep Learning
Caliskan, Abdullah
Badem, Hasan
Basturk, Alper
Yuksel, Mehmet Emin
2016 NATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS AND BIOMEDICAL ENGINEERING (ELECO), 2016, : 503 - 506
[7] A comparative evaluation of convolutional neural networks, training image sizes, and deep learning optimizers for weed detection in alfalfa
Yang, Jie
Bagavathiannan, Muthukumar
Wang, Yundi
Chen, Yong
Yu, Jialin
WEED TECHNOLOGY, 2022, 36 (04) : 512 - 522
[8] A COMPARATIVE STUDY OF NEIGHBORHOOD TOPOLOGIES FOR PARTICLE SWARM OPTIMIZERS
Reyes Medina, Angelina Jane
Toscano Pulido, Gregorio
Ramirez Torres, Jose Gabriel
IJCCI 2009: PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON COMPUTATIONAL INTELLIGENCE, 2009, : 152 - 159
[9] Descending through a Crowded Valley-Benchmarking Deep Learning Optimizers
Schmidt, Robin M.
Schneider, Frank
Hennig, Philipp
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
[10] Comparative Study on Crowd Counting with Deep Learning
Shabbir, Uzair
Sang, Jun
Alam, Mohammad S.
Tan, Jinghan
Xia, Xiaofeng
PATTERN RECOGNITION AND TRACKING XXXI, 2020, 11400
