A Comparative Study of Recently Deep Learning Optimizers

Cited by: 1
Authors
Liu, Yan [1]
Zhang, Maojun [1]
Zhong, Zhiwei [1]
Zeng, Xiangrong [1]
Long, Xin [1]
Affiliations
[1] Natl Univ Def Technol, Coll Syst Engn, Changsha, Peoples R China
Keywords
optimizers; deep learning; Hessian matrix; proxy algorithm
DOI
10.1117/12.2626430
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning has achieved great success in computer vision, natural language processing, recommendation systems, and other fields. However, deep neural network (DNN) models are very complex, often containing millions of parameters and tens or even hundreds of layers. When optimizing DNN weights, training easily falls into local optima, which makes it hard to achieve better performance. Choosing an effective optimizer that yields a network with higher precision and stronger generalization ability is therefore of great significance. In this article, we review some popular historical and state-of-the-art optimizers and group them into three main streams: first-order optimizers, which accelerate the convergence of stochastic gradient descent and/or adaptively adjust learning rates; second-order optimizers, which use second-order information about the loss landscape to help escape from local optima; and proxy optimizers, which can handle non-differentiable loss functions by combining with the proxy algorithm. We also summarize the first- and second-order moments used in different optimizers. Moreover, we compare several optimizers on image classification. The results show that first-order optimizers such as AdaMod and Ranger not only have low computational cost but also converge quickly. Meanwhile, optimizers that introduce curvature information, such as AdaBelief and Apollo, generalize better, especially when optimizing complex networks.
Pages: 9
Related papers
50 in total
  • [1] Plant Disease Classification: A Comparative Evaluation of Convolutional Neural Networks and Deep Learning Optimizers
    Saleem, Muhammad Hammad
    Potgieter, Johan
    Arif, Khalid Mahmood
    PLANTS-BASEL, 2020, 9 (10): 1 - 17
  • [2] Experimental Comparison of Stochastic Optimizers in Deep Learning
    Okewu, Emmanuel
    Adewole, Philip
    Sennaike, Oladipupo
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2019, PT V: 19TH INTERNATIONAL CONFERENCE, SAINT PETERSBURG, RUSSIA, JULY 14, 2019, PROCEEDINGS, PART V, 2019, 11623 : 704 - 715
  • [3] NeuroEvoBench: Benchmarking Evolutionary Optimizers for Deep Learning Applications
    Lange, Robert Tjarko
    Tang, Yujin
    Tian, Yingtao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [4] Evolution and Role of Optimizers in Training Deep Learning Models
    XiaoHao Wen
    MengChu Zhou
    IEEE/CAA Journal of Automatica Sinica, 2024, 11 (10) : 2039 - 2042
  • [6] A Comparative Study on Classification by Deep Learning
    Caliskan, Abdullah
    Badem, Hasan
    Basturk, Alper
    Yuksel, Mehmet Emin
    2016 NATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS AND BIOMEDICAL ENGINEERING (ELECO), 2016: 503 - 506
  • [7] A comparative evaluation of convolutional neural networks, training image sizes, and deep learning optimizers for weed detection in alfalfa
    Yang, Jie
    Bagavathiannan, Muthukumar
    Wang, Yundi
    Chen, Yong
    Yu, Jialin
    WEED TECHNOLOGY, 2022, 36 (04) : 512 - 522
  • [8] A COMPARATIVE STUDY OF NEIGHBORHOOD TOPOLOGIES FOR PARTICLE SWARM OPTIMIZERS
    Reyes Medina, Angelina Jane
    Toscano Pulido, Gregorio
    Ramirez Torres, Jose Gabriel
    IJCCI 2009: PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON COMPUTATIONAL INTELLIGENCE, 2009: 152 - 159
  • [9] Descending through a Crowded Valley-Benchmarking Deep Learning Optimizers
    Schmidt, Robin M.
    Schneider, Frank
    Hennig, Philipp
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [10] Comparative Study on Crowd Counting with Deep Learning
    Shabbir, Uzair
    Sang, Jun
    Alam, Mohammad S.
    Tan, Jinghan
    Xia, Xiaofeng
    PATTERN RECOGNITION AND TRACKING XXXI, 2020, 11400