Experimental Comparison of Stochastic Optimizers in Deep Learning

Cited by: 18
Authors
Okewu, Emmanuel [1 ]
Adewole, Philip [2 ]
Sennaike, Oladipupo [2 ]
Affiliations
[1] Univ Lagos, Ctr Informat Technol & Syst, Lagos, Nigeria
[2] Univ Lagos, Dept Comp Sci, Lagos, Nigeria
Keywords
Deep learning; Deep neural networks; Error function; Neural network parameters; Stochastic optimization; Neural networks
DOI
10.1007/978-3-030-24308-1_55
Chinese Library Classification (CLC)
TP301 [Theory and Methods]
Discipline Classification Code
081202
Abstract
The stochastic optimization problem in deep learning involves finding optimal values of the loss function and the neural network parameters using a meta-heuristic search algorithm. Because these values cannot reasonably be obtained with a deterministic optimization technique, an iterative method is needed that randomly selects data segments, arbitrarily initializes the optimization (network) parameters, and repeatedly computes the error function until a tolerable error is attained. The typical stochastic optimization algorithm for training deep neural networks, a non-convex optimization problem, is gradient descent, with extensions such as Stochastic Gradient Descent (SGD), Adagrad, Adadelta, RMSProp, and Adam. Each of these stochastic optimizers represents an improvement over its predecessors in terms of accuracy, convergence rate, and training time; however, there is room for further improvement. This paper presents the outcomes of a series of experiments conducted to provide empirical evidence of the progress made so far. We used Python deep learning libraries (TensorFlow and the Keras API) for our experiments. Each algorithm was executed, the results were collated, and a case is made for further research in deep learning to improve the training time and convergence rate of deep neural networks, as well as the accuracy of outcomes. This responds to the growing demand for deep learning in mission-critical and highly sophisticated decision-making processes across industry verticals.
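The kind of comparison the abstract describes can be reproduced in outline with a few lines of Keras. All of the named optimizers refine the basic stochastic gradient step θ ← θ − η∇θL(θ; xᵢ, yᵢ), differing in how they adapt the step size η per parameter. The sketch below is illustrative rather than the paper's exact setup: the dataset (MNIST), the small dense network, and the hyperparameters are assumptions chosen for brevity. It trains the same architecture under each optimizer and reports test accuracy and wall-clock training time, two of the criteria the paper compares.

```python
# Minimal sketch: compare the stochastic optimizers named in the abstract
# on one small network, recording test accuracy and wall-clock training time.
# Dataset, architecture, and hyperparameters are illustrative assumptions,
# not the paper's exact experimental configuration.
import time
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

# Build a fresh optimizer per run: Keras optimizers accumulate per-variable
# state, so a single instance should not be shared across models.
optimizer_factories = {
    "SGD": lambda: tf.keras.optimizers.SGD(learning_rate=0.01),
    "Adagrad": lambda: tf.keras.optimizers.Adagrad(),
    "Adadelta": lambda: tf.keras.optimizers.Adadelta(),
    "RMSprop": lambda: tf.keras.optimizers.RMSprop(),
    "Adam": lambda: tf.keras.optimizers.Adam(),
}

for name, make_optimizer in optimizer_factories.items():
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer=make_optimizer(),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    start = time.time()
    model.fit(x_train, y_train, epochs=3, batch_size=128, verbose=0)
    elapsed = time.time() - start
    _, test_acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"{name}: test accuracy {test_acc:.4f}, training time {elapsed:.1f}s")
```

With the epoch budget held fixed, the printout makes accuracy and training time directly comparable across optimizers; convergence rate can be inspected by logging per-epoch loss from the History object that fit returns.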
Pages: 704-715
Page count: 12
Related Papers
50 records in total
  • [31] An Experimental Study of Stochastic Learning
    Rouanet, H.
    ACTA PSYCHOLOGICA, 1961, 19 (01) : 338 - 339
  • [32] Circa: Stochastic ReLUs for Private Deep Learning
    Ghodsi, Zahra
    Jha, Nandan Kumar
    Reagen, Brandon
    Garg, Siddharth
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [33] Classification of stochastic processes based on deep learning
    Al-Murisi, Shamsan A.
    Tang, Xiangong
    Deng, Weihua
    JOURNAL OF PHYSICS-COMPLEXITY, 2024, 5 (01)
  • [34] Stochastic Least Squares Learning for Deep Architectures
    Kumar, Girish
    Sim, Jian Min
    Cheu, Eng Yeow
    Li, Xiaoli
    2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015
  • [35] Stochastic Gradient Push for Distributed Deep Learning
    Assran, Mahmoud
    Loizou, Nicolas
    Ballas, Nicolas
    Rabbat, Michael
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [36] Stochastic Integrated Actor-Critic for Deep Reinforcement Learning
    Zheng, Jiaohao
    Kurt, Mehmet Necip
    Wang, Xiaodong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (05) : 6654 - 6666
  • [37] An Experimental Comparison between Deep Learning and Classical Machine Learning Approaches for Writer Identification in Medieval Documents
    Cilia, Nicole Dalia
    De Stefano, Claudio
    Fontanella, Francesco
    Marrocco, Claudio
    Molinara, Mario
    Freca, Alessandra Scotto di
    JOURNAL OF IMAGING, 2020, 6 (09)
  • [38] Deep Learning Identifies Tomato Leaf Disease by Comparing Four Architectures Using Two Types of Optimizers
    Bouni, Mohamed
    Hssina, Badr
    Douzi, Khadija
    Douzi, Samira
    ADVANCED NETWORK TECHNOLOGIES AND INTELLIGENT COMPUTING, ANTIC 2021, 2022, 1534 : 263 - 273
  • [39] Toolkit for the Automatic Comparison of Optimizers: comparing large-scale global optimizers made easy
    Molina, Daniel
    LaTorre, Antonio
    2018 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2018, : 1229 - 1236
  • [40] Evolutionary Pareto optimizers for continuous review stochastic inventory systems
    Tsou, Ching-Shih
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2009, 195 (02) : 364 - 371