An adaptive mechanism to achieve learning rate dynamically

Cited by: 1
Authors
Jinjing Zhang
Fei Hu
Li Li
Xiaofei Xu
Zhanbo Yang
Yanbin Chen
Affiliations
[1] Southwest University, School of Computer and Information Science
[2] Chongqing University of Education, Network Centre
[3] Chongqing University, School of Computer Science
Source
Keywords
Adaptive mechanism; Learning rate; Adaptive exponential decay rates; Gradient
DOI
Not available
CLC number
Subject classification number
Abstract
Gradient descent is prevalent in large-scale optimization problems in machine learning; in particular, it now plays a major role in computing and correcting the connection strengths of neural networks in deep learning. However, many gradient-based optimization methods have sensitive hyper-parameters that require extensive tuning. In this paper, we present a novel adaptive mechanism called adaptive exponential decay rate (AEDR). AEDR uses an adaptive exponential decay rate rather than a fixed, preconfigured one, which allows us to eliminate one otherwise tuning-sensitive hyper-parameter. AEDR calculates the exponential decay rate adaptively from the moving averages of both the gradients and the squared gradients over time. The mechanism is then applied to Adadelta and Adam; it reduces the number of hyper-parameters of Adadelta and Adam to a single one to be tuned. We use long short-term memory (LSTM) networks and LeNet to demonstrate how the learning rate adapts dynamically. We show promising results compared with other state-of-the-art methods on four data sets: IMDB (movie reviews), SemEval-2016 (sentiment analysis in Twitter), CIFAR-10, and Pascal VOC-2012.
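For context, the sketch below shows the standard Adam update with fixed, preconfigured exponential decay rates, i.e. the kind of hyper-parameters the abstract says AEDR replaces with adaptive ones. It is not the AEDR method: the abstract does not give the AEDR formula, so none is attempted here. The function name adam_minimize, the scalar test problem, and the default values are assumptions chosen for illustration.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Standard Adam on a single scalar parameter (illustrative only).

    beta1 and beta2 are the fixed exponential decay rates of the moving
    averages of the gradient (m) and the squared gradient (v). They are
    the preconfigured hyper-parameters that an adaptive scheme would
    compute during training instead.
    """
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        # Exponential moving averages of the gradient and squared gradient.
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction for the zero initialisation of m and v.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        # Per-step update scaled by the running gradient statistics.
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Hypothetical test problem: minimise f(x) = (x - 3)^2, gradient 2 * (x - 3).
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

With fixed decay rates, lr, beta1, and beta2 all have to be chosen in advance; the abstract states that applying AEDR to Adam leaves only a single hyper-parameter to tune.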
Pages: 6685-6698
Page count: 13