A Comparative Study of Recently Deep Learning Optimizers

Cited by: 1
Authors
Liu, Yan [1]
Zhang, Maojun [1]
Zhong, Zhiwei [1]
Zeng, Xiangrong [1]
Long, Xin [1]
Affiliations
[1] Natl Univ Def Technol, Coll Syst Engn, Changsha, Peoples R China
Keywords
optimizers; deep learning; Hessian matrix; proxy algorithm
DOI
10.1117/12.2626430
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning has achieved great success in computer vision, natural language processing, recommendation systems, and other fields. However, deep neural network (DNN) models are very complex, often containing millions of parameters and tens or even hundreds of layers. Optimizing the weights of a DNN can easily get trapped in local optima, which makes it hard to reach better performance. Choosing an effective optimizer that yields a network with higher precision and stronger generalization ability is therefore of great significance. In this article, we review some popular historical and state-of-the-art optimizers and group them into three main streams: first-order optimizers, which accelerate the convergence of stochastic gradient descent and/or adaptively adjust learning rates; second-order optimizers, which use second-order information about the loss landscape to help escape from local optima; and proxy optimizers, which can handle non-differentiable loss functions by combining with the proxy algorithm. We also summarize the first- and second-order moments used in different optimizers. Moreover, we compare several optimizers on image classification. The results show that first-order optimizers such as AdaMod and Ranger not only have low computational cost but also converge quickly. Meanwhile, optimizers that introduce curvature information, such as AdaBelief and Apollo, generalize better, especially when optimizing complex networks.
Pages: 9