Experimental Comparison of Stochastic Optimizers in Deep Learning

Cited by: 18
Authors
Okewu, Emmanuel [1 ]
Adewole, Philip [2 ]
Sennaike, Oladipupo [2 ]
Affiliations
[1] Univ Lagos, Ctr Informat Technol & Syst, Lagos, Nigeria
[2] Univ Lagos, Dept Comp Sci, Lagos, Nigeria
Keywords
Deep learning; Deep neural networks; Error function; Neural network parameters; Stochastic optimization; Neural networks
DOI
10.1007/978-3-030-24308-1_55
Chinese Library Classification (CLC)
TP301 [Theory and Methods]
Discipline Classification Code
081202
Abstract
The stochastic optimization problem in deep learning involves finding optimal values of the loss function and the neural network parameters using a meta-heuristic search algorithm. Because these values cannot reasonably be obtained with a deterministic optimization technique, an iterative method is needed that randomly selects data segments, arbitrarily initializes the optimization (network) parameters, and repeatedly computes the error function until a tolerable error is attained. The typical stochastic optimization algorithm for training deep neural networks, a non-convex optimization problem, is gradient descent, with extensions such as Stochastic Gradient Descent (SGD), Adagrad, Adadelta, RMSProp, and Adam. Each of these stochastic optimizers represents an improvement over its predecessors in terms of accuracy, convergence rate, and training time; however, there is room for further improvement. This paper presents the outcomes of a series of experiments conducted to provide empirical evidence of the progress made so far. We used Python deep learning libraries (TensorFlow and the Keras API) for our experiments. Each algorithm was executed, the results were collated, and a case is made for further research in deep learning to improve the training time and convergence rate of deep neural networks, as well as the accuracy of outcomes. This responds to the growing demand for deep learning in mission-critical and highly sophisticated decision-making processes across industry verticals.
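The kind of comparison the abstract describes can be reproduced in outline with a few lines of Keras. All of the named optimizers refine the basic stochastic gradient step θ ← θ − η∇θL(θ; xᵢ, yᵢ), differing in how they adapt the step size η per parameter. The sketch below is illustrative rather than the paper's exact setup: the dataset (MNIST), the small dense network, and the hyperparameters are assumptions chosen for brevity. It trains the same architecture under each optimizer and reports test accuracy and wall-clock training time, two of the criteria the paper compares.

```python
# Minimal sketch: compare the stochastic optimizers named in the abstract
# on one small network, recording test accuracy and wall-clock training time.
# Dataset, architecture, and hyperparameters are illustrative assumptions,
# not the paper's exact experimental configuration.
import time
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

# Build a fresh optimizer per run: Keras optimizers accumulate per-variable
# state, so a single instance should not be shared across models.
optimizer_factories = {
    "SGD": lambda: tf.keras.optimizers.SGD(learning_rate=0.01),
    "Adagrad": lambda: tf.keras.optimizers.Adagrad(),
    "Adadelta": lambda: tf.keras.optimizers.Adadelta(),
    "RMSprop": lambda: tf.keras.optimizers.RMSprop(),
    "Adam": lambda: tf.keras.optimizers.Adam(),
}

for name, make_optimizer in optimizer_factories.items():
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer=make_optimizer(),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    start = time.time()
    model.fit(x_train, y_train, epochs=3, batch_size=128, verbose=0)
    elapsed = time.time() - start
    _, test_acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"{name}: test accuracy {test_acc:.4f}, training time {elapsed:.1f}s")
```

With the epoch budget held fixed, the printout makes accuracy and training time directly comparable across optimizers; convergence rate can be inspected by logging per-epoch loss from the History object that fit returns.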
Pages: 704-715
Page count: 12
Related Papers
50 records in total
  • [31] An Experimental Study of Stochastic Learning
    Rouanet, H.
    ACTA PSYCHOLOGICA, 1961, 19 (01) : 338 - 339
  • [32] Circa: Stochastic ReLUs for Private Deep Learning
    Ghodsi, Zahra
    Jha, Nandan Kumar
    Reagen, Brandon
    Garg, Siddharth
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [33] Classification of stochastic processes based on deep learning
    Al-Murisi, Shamsan A.
    Tang, Xiangong
    Deng, Weihua
    JOURNAL OF PHYSICS-COMPLEXITY, 2024, 5 (01)
  • [34] Stochastic Least Squares Learning for Deep Architectures
    Kumar, Girish
    Sim, Jian Min
    Cheu, Eng Yeow
    Li, Xiaoli
    2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015
  • [35] Stochastic Gradient Push for Distributed Deep Learning
    Assran, Mahmoud
    Loizou, Nicolas
    Ballas, Nicolas
    Rabbat, Michael
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [36] Stochastic Integrated Actor-Critic for Deep Reinforcement Learning
    Zheng, Jiaohao
    Kurt, Mehmet Necip
    Wang, Xiaodong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (05) : 6654 - 6666
  • [37] An Experimental Comparison between Deep Learning and Classical Machine Learning Approaches for Writer Identification in Medieval Documents
    Cilia, Nicole Dalia
    De Stefano, Claudio
    Fontanella, Francesco
    Marrocco, Claudio
    Molinara, Mario
    Freca, Alessandra Scotto di
    JOURNAL OF IMAGING, 2020, 6 (09)
  • [38] Deep Learning Identifies Tomato Leaf Disease by Comparing Four Architectures Using Two Types of Optimizers
    Bouni, Mohamed
    Hssina, Badr
    Douzi, Khadija
    Douzi, Samira
    ADVANCED NETWORK TECHNOLOGIES AND INTELLIGENT COMPUTING, ANTIC 2021, 2022, 1534 : 263 - 273
  • [39] Toolkit for the Automatic Comparison of Optimizers: comparing large-scale global optimizers made easy
    Molina, Daniel
    LaTorre, Antonio
    2018 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2018, : 1229 - 1236
  • [40] Evolutionary Pareto optimizers for continuous review stochastic inventory systems
    Tsou, Ching-Shih
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2009, 195 (02) : 364 - 371