On the Trend-corrected Variant of Adaptive Stochastic Optimization Methods

Cited by: 0
Authors
Zhou, Bingxin [1]
Zheng, Xuebin [1]
Gao, Junbin [1]
Affiliations
[1] Univ Sydney, Business Sch, Sydney, NSW, Australia
Keywords
Stochastic Gradient Descent; ADAM; Deep Learning; Optimization
DOI
10.1109/ijcnn48605.2020.9207166
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Adam-type optimizers, a class of adaptive moment estimation methods built on exponential moving averages, have been used successfully in many deep learning applications. Such methods are appealing because they handle large-scale sparse datasets with high computational efficiency. In this paper, we present a new framework for Adam-type methods that incorporates trend information when updating the parameters with the adaptive step size and gradients. The additional terms encourage efficient movement across complex cost surfaces, so the loss converges more rapidly. We show empirically the importance of the trend component: our framework consistently outperforms the conventional Adam and AMSGrad methods on classical models across several real-world datasets.
Pages: 8
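To make the idea in the abstract concrete, below is a minimal illustrative sketch of a trend-corrected Adam-type update in Python/NumPy. The specific form of the trend term (an exponential moving average of the change in the first moment, scaled by a coefficient gamma) and all names (trend_corrected_adam_step, beta3, gamma) are assumptions made for illustration, not the authors' exact algorithm.

# Minimal sketch of a trend-corrected Adam-type update (illustrative
# assumptions, not the paper's exact formulation).
import numpy as np

def trend_corrected_adam_step(theta, grad, state,
                              lr=1e-3, beta1=0.9, beta2=0.999,
                              beta3=0.9, gamma=0.1, eps=1e-8):
    """One update combining Adam's adaptive step size with a trend correction."""
    m_prev = state["m"]
    state["t"] += 1
    t = state["t"]

    # Standard Adam exponential moving averages of the gradient and its square.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2

    # Assumed trend term: EMA of the change in the first moment between steps.
    state["d"] = beta3 * state["d"] + (1 - beta3) * (state["m"] - m_prev)

    # Bias-corrected moment estimates, as in Adam.
    m_hat = state["m"] / (1 - beta1 ** t)
    v_hat = state["v"] / (1 - beta2 ** t)

    # Parameter update: adaptive step on the first moment plus the trend correction.
    return theta - lr * (m_hat + gamma * state["d"]) / (np.sqrt(v_hat) + eps)

# Usage (per parameter tensor): state = {"m": 0.0, "v": 0.0, "d": 0.0, "t": 0};
# theta = trend_corrected_adam_step(theta, grad, state)

Setting gamma to 0 recovers the standard Adam update, which makes the effect of the added trend term easy to isolate in experiments.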
Related Papers (50 records in total)
  • [41] An adaptive stochastic optimization algorithm for resource allocation
    Fontaine, Xavier
    Mannor, Shie
    Perchet, Vianney
    ALGORITHMIC LEARNING THEORY, VOL 117, 2020, 117 : 319 - 363
  • [42] OPTIMIZATION OF STOCHASTIC BLACKBOXES WITH ADAPTIVE PRECISION
    Alarie, Stephane
    Audet, Charles
    Bouchet, Pierre-Yves
    Le Digabel, Sebastien
    SIAM JOURNAL ON OPTIMIZATION, 2021, 31 (04) : 3127 - 3156
  • [43] Adaptive noisy importance sampling for stochastic optimization
    Deniz Akyildiz, Omer
    Marino, Ines P.
    Miguez, Joaquin
    2017 IEEE 7TH INTERNATIONAL WORKSHOP ON COMPUTATIONAL ADVANCES IN MULTI-SENSOR ADAPTIVE PROCESSING (CAMSAP), 2017,
  • [44] Distributed Stochastic Optimization via Adaptive SGD
    Cutkosky, Ashok
    Busa-Fekete, Robert
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [45] AdaDelay: Delay Adaptive Distributed Stochastic Optimization
    Sra, Suvrit
    Yu, Adams Wei
    Li, Mu
    Smola, Alexander J.
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 51, 2016, 51 : 957 - 965
  • [46] Adaptive Stochastic Convex Optimization Over Networks
    Towfic, Zaid J.
    Sayed, Ali H.
    2013 51ST ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2013, : 1272 - 1277
  • [47] ADAPTIVE STEP ADJUSTMENT FOR A STOCHASTIC OPTIMIZATION ALGORITHM
    Mirzoakhmedov, F.
    Uryasev, S. P.
    USSR COMPUTATIONAL MATHEMATICS AND MATHEMATICAL PHYSICS, 1983, 23 (06) : 20 - 27
  • [48] Delay-Adaptive Distributed Stochastic Optimization
    Ren, Zhaolin
    Zhou, Zhengyuan
    Qiu, Linhai
    Deshpande, Ajay
    Kalagnanam, Jayant
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 5503 - 5510
  • [49] Adaptive Term Weighting through Stochastic Optimization
    Granitzer, Michael
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2010, 6008 : 614 - 626
  • [50] Adaptive Stochastic Mirror Descent for Constrained Optimization
    Bayandina, Anastasia
    2017 CONSTRUCTIVE NONSMOOTH ANALYSIS AND RELATED TOPICS (DEDICATED TO THE MEMORY OF V.F. DEMYANOV) (CNSA), 2017, : 40 - 43