A Comparative Study of Recently Deep Learning Optimizers

Cited by: 1
Authors
Liu, Yan [1]
Zhang, Maojun [1]
Zhong, Zhiwei [1]
Zeng, Xiangrong [1]
Long, Xin [1]
Affiliations
[1] Natl Univ Def Technol, Coll Syst Engn, Changsha, Peoples R China
Keywords
optimizers; deep learning; hessian matrix; proxy algorithm
DOI
10.1117/12.2626430
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep learning has achieved great success in computer vision, natural language processing, recommendation systems, and other fields. However, deep neural network (DNN) models are very complex, often containing millions of parameters and tens or even hundreds of layers. When optimizing the weights of a DNN, training easily falls into local optima, which makes better performance hard to achieve. Choosing an effective optimizer that yields a network with higher precision and stronger generalization ability is therefore of great significance. In this article, we review some popular historical and state-of-the-art optimizers and group them into three main streams: first-order optimizers, which accelerate the convergence of stochastic gradient descent and/or adaptively adjust learning rates; second-order optimizers, which use second-order information about the loss landscape to help escape from local optima; and proxy optimizers, which can handle non-differentiable loss functions by combining with the proxy algorithm. We also summarize the first- and second-order moments used in different optimizers. Moreover, we compare several optimizers on image classification. The results show that first-order optimizers such as AdaMod and Ranger not only have low computational cost but also converge quickly. Meanwhile, optimizers that introduce curvature information, such as AdaBelief and Apollo, generalize better, especially when optimizing complex networks.
Pages: 9
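The abstract refers to the first- and second-order moments that adaptive first-order optimizers maintain. As a rough illustration only, the sketch below implements a single-parameter Adam-style update. Adam is not named in this record; it is used here as a well-known representative of the family, and the function names, the quadratic test loss, and the hyperparameter values are all assumptions made for this example, not the paper's experimental setup.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-style update for a scalar parameter (illustrative sketch).

    m is the first moment (running mean of gradients) and v is the second
    moment (running mean of squared gradients); t is the 1-based step count.
    """
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction compensates for m and v being initialised at zero.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def minimize_quadratic(steps=500, start=5.0):
    """Minimise the hypothetical loss f(theta) = theta**2 with adam_step."""
    theta, m, v = start, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * theta  # analytic gradient of theta**2
        theta, m, v = adam_step(theta, grad, m, v, t)
    return theta


if __name__ == "__main__":
    print(minimize_quadratic())
```

Dividing the first moment by the square root of the second moment makes the step size roughly independent of the gradient's scale, which is the sense in which such optimizers "adaptively adjust learning rates". The other optimizers named in the abstract modify these moment estimates in different ways that this sketch does not attempt to reproduce.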