An adaptive mechanism to achieve learning rate dynamically

Cited by: 1

Authors:

Jinjing Zhang

Fei Hu

Li Li

Xiaofei Xu

Zhanbo Yang

Yanbin Chen

Affiliations:

[1] Southwest University, School of Computer and Information Science

[2] Chongqing University of Education, Network Centre

[3] Chongqing University, School of Computer Science

Source:

Neural Computing and Applications | 2019 / Vol. 31

Keywords:

Adaptive mechanism; Learning rate; Adaptive exponential decay rates; Gradient

DOI:

Not available

Abstract:

Gradient descent is widely used for large-scale optimization problems in machine learning; in particular, it now plays a major role in computing and correcting the connection strengths of neural networks in deep learning. However, many gradient-based optimization methods contain sensitive hyper-parameters that require extensive tuning. In this paper, we present a novel adaptive mechanism called adaptive exponential decay rate (AEDR). AEDR uses an adaptive exponential decay rate rather than a fixed, preconfigured one, which allows us to eliminate one otherwise tuning-sensitive hyper-parameter. AEDR calculates the exponential decay rate adaptively by employing the moving averages of both gradients and squared gradients over time. The mechanism is then applied to Adadelta and Adam; it reduces the number of hyper-parameters of Adadelta and Adam to only a single one to be tuned. We use long short-term memory (LSTM) networks and LeNet to demonstrate how the learning rate adapts dynamically. We show promising results compared with other state-of-the-art methods on four data sets: IMDB (movie reviews), SemEval-2016 (sentiment analysis in Twitter), CIFAR-10 and Pascal VOC-2012.
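The abstract does not state the AEDR update rule, so it is not reproduced here. As background only, the following minimal sketch shows standard Adam on a scalar parameter, with the two moving averages the abstract mentions (of gradients and of squared gradients) and the fixed, preconfigured exponential decay rates `beta1` and `beta2` that a mechanism like AEDR would aim to replace. The function names, the toy quadratic objective and all default values are illustrative assumptions, not taken from the paper.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Standard Adam on a single scalar parameter.

    beta1 and beta2 are the fixed exponential decay rates; this sketch
    does NOT implement AEDR, which would compute such a rate adaptively.
    """
    x = x0
    m = 0.0  # moving average of gradients
    v = 0.0  # moving average of squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction for the zero-initialised averages.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def quadratic_grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2 (assumed example)."""
    return 2.0 * (x - 3.0)


x_final = adam_minimize(quadratic_grad, x0=0.0)
print(x_final)
```

In this baseline, `lr`, `beta1`, `beta2` and `eps` all have to be chosen in advance; the abstract's claim is that AEDR leaves only a single hyper-parameter to tune for Adam and Adadelta.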

Pages: 6685-6698

Page count: 13
