A GRADIENT DESCENT SARSA(λ) ALGORITHM BASED ON THE ADAPTIVE REWARD-SHAPING MECHANISM

Cited by: 0
Authors
Liu, Quan [1]
Fu, QiMing [1]
Xiao, Fei [1]
Fu, YuChen [1]
Affiliations
[1] Soochow Univ, Dept Comp Sci & Technol, Suzhou, Peoples R China
Source
INTELLIGENT AUTOMATION AND SOFT COMPUTING | 2013 / Vol. 19 / No. 04
Keywords
reinforcement learning; Sarsa(lambda); gradient descent; reward-shaping; adaptive
DOI
10.1080/10798587.2013.869119
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Based on an adaptive reward-shaping mechanism, we propose a novel gradient descent (GD) Sarsa(lambda) algorithm to address poor initial performance and slow convergence in reinforcement learning tasks with continuous state spaces. An adaptive normalized radial basis function (ANRBF) network is used to shape the reward. The reward-shaping mechanism propagates model knowledge to the learner in the form of an additional reward signal, so that both initial performance and convergence speed are improved. A function approximation algorithm named ANRBF-GD-Sarsa(lambda) is proposed based on the ANRBF network, and its convergence is analyzed theoretically. Experiments demonstrate the good initial performance and high convergence speed of the proposed algorithm.
Pages: 599-612
Page count: 14
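
The abstract combines three ingredients: a normalized radial basis function representation of a continuous state, a shaping signal added to the environment reward, and gradient-descent Sarsa(lambda) with eligibility traces. The sketch below illustrates how these pieces can fit together; it is not the paper's ANRBF-GD-Sarsa(lambda). It assumes fixed (non-adaptive) RBF centers, a hand-picked potential function for standard potential-based shaping, and a toy one-dimensional corridor task. All names and constants (`CENTERS`, `potential`, `step`, and so on) are hypothetical.

```python
import numpy as np

CENTERS = np.linspace(0.0, 1.0, 11)  # fixed RBF centers (assumption; the paper adapts its network)
WIDTH = 0.1                          # shared RBF width (assumption)
ACTIONS = (-0.05, 0.05)              # move left / move right
GAMMA, LAMBDA, ALPHA, EPSILON = 0.99, 0.9, 0.05, 0.1


def features(x):
    """Normalized RBF activations of state x; they sum to 1."""
    g = np.exp(-((x - CENTERS) ** 2) / (2.0 * WIDTH ** 2))
    return g / g.sum()


def potential(x):
    """Hypothetical potential: a rough guess of minus the steps left to reach x = 1."""
    return -20.0 * (1.0 - x)


def shaped_reward(r, x, x_next, done):
    """Potential-based shaping: r + gamma * Phi(s') - Phi(s), with Phi(terminal) = 0."""
    phi_next = 0.0 if done else potential(x_next)
    return r + GAMMA * phi_next - potential(x)


def step(x, a):
    """Toy corridor on [0, 1]: reward -1 per step, episode ends when x reaches 1."""
    x_next = min(max(x + ACTIONS[a], 0.0), 1.0)
    return x_next, -1.0, x_next >= 1.0


def choose(theta, x, rng):
    """Epsilon-greedy action from the linear estimate Q(x, a) = theta[a] . features(x)."""
    if rng.random() < EPSILON:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(theta @ features(x)))


def run_episode(theta, rng, max_steps=500):
    """One episode of gradient-descent Sarsa(lambda); updates theta in place, returns step count."""
    traces = np.zeros_like(theta)
    x = 0.0
    a = choose(theta, x, rng)
    for t in range(1, max_steps + 1):
        x_next, r, done = step(x, a)
        phi = features(x)
        # TD error computed on the shaped reward
        delta = shaped_reward(r, x, x_next, done) - theta[a] @ phi
        if not done:
            a_next = choose(theta, x_next, rng)
            delta += GAMMA * (theta[a_next] @ features(x_next))
        traces *= GAMMA * LAMBDA          # decay all eligibility traces
        traces[a] += phi                  # gradient of Q(x, a) with respect to theta[a]
        theta += ALPHA * delta * traces   # gradient-descent weight update
        if done:
            return t
        x, a = x_next, a_next
    return max_steps
```

A usage example under the same assumptions: create `theta = np.zeros((len(ACTIONS), len(CENTERS)))` and `rng = np.random.default_rng(0)`, then call `run_episode(theta, rng)` repeatedly and track the returned step counts. Setting `potential` to return 0 recovers plain GD Sarsa(lambda) on the same task, which gives a baseline for comparing early-episode behavior.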
Related papers
50 in total
  • [31] Tip-tilt adaptive correction based on stochastic parallel gradient descent optimization algorithm
    Ma, Huimin
    Zhang, Pengfei
    Zhang, Jinghui
    Qiao, Chunhong
    Fan, Chengyu
    OPTICAL DESIGN AND TESTING IV, 2010, 7849
  • [32] Solving Electromagnetic Inverse Problem Using Adaptive Gradient Descent Algorithm
    Liu, Lian
    Yang, Bo
    Zhang, Yi
    Xu, Yixian
    Peng, Zhong
    Wang, Feng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [33] A generalized adaptive gradient descent algorithm for the deconvolution of noisy blurred images
    Zhu, D
    Razaz, M
    Lee, R
    PROCEEDINGS OF THE 7TH JOINT CONFERENCE ON INFORMATION SCIENCES, 2003, : 789 - 792
  • [34] A normalised adaptive amplitude nonlinear gradient descent algorithm for system identification
    Boukis, CG
    Papoulis, EV
    ICECS 2003: PROCEEDINGS OF THE 2003 10TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS AND SYSTEMS, VOLS 1-3, 2003, : 1042 - 1045
  • [35] Research on the Quadrotor of AHRS based on Gradient Descent Algorithm
    Lin Feng
    He Liuzeng
    2018 EIGHTH INTERNATIONAL CONFERENCE ON INSTRUMENTATION AND MEASUREMENT, COMPUTER, COMMUNICATION AND CONTROL (IMCCC 2018), 2018, : 1831 - 1834
  • [36] Hinge Classification Algorithm Based on Asynchronous Gradient Descent
    Yan, Xiaodan
    Zhang, Tianxin
    Cui, Baojiang
    Deng, Jiangdong
    ADVANCES ON BROAD-BAND WIRELESS COMPUTING, COMMUNICATION AND APPLICATIONS, BWCCA-2017, 2018, 12 : 459 - 468
  • [37] Multifactorial Evolutionary Algorithm Based on Diffusion Gradient Descent
    Liu, Zhaobo
    Li, Guo
    Zhang, Haili
    Liang, Zhengping
    Zhu, Zexuan
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (07) : 4267 - 4279
  • [38] Simultaneous adaptive control of dual deformable mirrors for full-field beam shaping with the improved stochastic parallel gradient descent algorithm
    Ma, Haotong
    Liu, Zejin
    Xu, Xiaojun
    Chen, Jinbao
    OPTICS LETTERS, 2013, 38 (03) : 326 - 328
  • [39] Quaternion-based Kalman Filter for AHRS Using an Adaptive-step Gradient Descent Algorithm
    Wang, Li
    Zhang, Zheng
    Sun, Ping
    INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2015, 12
  • [40] Recurrent neural tracking control based on multivariable robust adaptive gradient-descent training algorithm
    Zhao Xu
    Qing Song
    Danwei Wang
    Neural Computing and Applications, 2012, 21 : 1745 - 1755