A GRADIENT DESCENT SARSA(λ) ALGORITHM BASED ON THE ADAPTIVE REWARD-SHAPING MECHANISM

Authors
Liu, Quan [1]
Fu, QiMing [1]
Xiao, Fei [1]
Fu, YuChen [1]
Affiliations
[1] Soochow Univ, Dept Comp Sci & Technol, Suzhou, Peoples R China
Source
INTELLIGENT AUTOMATION AND SOFT COMPUTING | 2013 / Vol. 19 / No. 04
Keywords
reinforcement learning; Sarsa(lambda); gradient descent; reward-shaping; adaptive
DOI
10.1080/10798587.2013.869119
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Based on an adaptive reward-shaping mechanism, we propose a novel gradient descent (GD) Sarsa(lambda) algorithm to address poor initial performance and slow convergence in reinforcement learning tasks with continuous state spaces. An adaptive normalized radial basis function (ANRBF) network is used to shape the reward. The reward-shaping mechanism propagates model knowledge to the learner in the form of an additional reward signal, so that initial performance and convergence speed can be improved effectively. A function approximation algorithm named ANRBF-GD-Sarsa(lambda) is proposed based on the ANRBF network. The convergence of ANRBF-GD-Sarsa(lambda) is analyzed theoretically. Experiments are conducted to show the good initial performance and high convergence speed of the proposed algorithm.
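To make the ingredients named in the abstract concrete, the following is a minimal Python sketch of one gradient-descent Sarsa(lambda) update with linear function approximation over normalized RBF features and an added shaping reward. It is not the authors' ANRBF-GD-Sarsa(lambda). The function names, the fixed (non-adaptive) centers and width, the step-size values, and the potential-based form of the shaping term are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def nrbf_features(state, centers, width):
    """Normalized Gaussian RBF activations for a scalar state (they sum to 1).

    Fixed centers and a shared width are assumed here; the adaptive
    part of an ANRBF network is not modeled.
    """
    act = np.exp(-((state - centers) ** 2) / (2.0 * width ** 2))
    return act / act.sum()

def shaped_sarsa_lambda_step(theta, trace, phi, phi_next, reward,
                             potential, potential_next,
                             alpha=0.1, gamma=0.95, lam=0.9, terminal=False):
    """One gradient-descent Sarsa(lambda) update with a shaping reward.

    phi and phi_next are feature vectors for (s, a) and (s', a').
    The shaping term gamma * potential_next - potential is the common
    potential-based form, assumed here rather than taken from the paper.
    """
    shaping = gamma * potential_next - potential
    q = theta @ phi
    q_next = 0.0 if terminal else theta @ phi_next
    # TD error computed on the shaped reward.
    delta = reward + shaping + gamma * q_next - q
    # Accumulating eligibility trace; for linear Q the gradient is phi.
    trace = gamma * lam * trace + phi
    theta = theta + alpha * delta * trace
    return theta, trace

# Hypothetical usage on a 1-D state in [0, 1] with five RBF centers.
centers = np.linspace(0.0, 1.0, 5)
phi = nrbf_features(0.3, centers, width=0.25)
phi_next = nrbf_features(0.4, centers, width=0.25)
theta, trace = shaped_sarsa_lambda_step(
    np.zeros(5), np.zeros(5), phi, phi_next,
    reward=1.0, potential=0.0, potential_next=0.5)
```

In this sketch the shaping signal only changes the TD error, so any source of prior knowledge that can be expressed as a potential over states could be plugged in through `potential` and `potential_next`.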
Pages: 599-612
Number of pages: 14