A GRADIENT DESCENT SARSA(λ) ALGORITHM BASED ON THE ADAPTIVE REWARD-SHAPING MECHANISM

Authors
Liu, Quan [1]
Fu, QiMing [1]
Xiao, Fei [1]
Fu, YuChen [1]
Affiliation
[1] Soochow University, Department of Computer Science and Technology, Suzhou, People's Republic of China
Source
INTELLIGENT AUTOMATION AND SOFT COMPUTING | 2013 / Vol. 19 / No. 4
Keywords
reinforcement learning; Sarsa(lambda); gradient descent; reward-shaping; adaptive
DOI
10.1080/10798587.2013.869119
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Based on an adaptive reward-shaping mechanism, we propose a novel gradient descent (GD) Sarsa(lambda) algorithm to address poor initial performance and slow convergence in reinforcement learning tasks with continuous state spaces. An adaptive normalized radial basis function (ANRBF) network is used to shape the reward. The reward-shaping mechanism propagates model knowledge to the learner in the form of an additional reward signal, so that both initial performance and convergence speed are improved. A function approximation algorithm named ANRBF-GD-Sarsa(lambda) is proposed based on the ANRBF network. The convergence of ANRBF-GD-Sarsa(lambda) is analyzed theoretically. Experiments show that the proposed algorithm achieves good initial performance and fast convergence.
Pages: 599-612
Page count: 14
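
The abstract names three ingredients: normalized RBF features, gradient-descent Sarsa(lambda), and an additional shaping reward. The sketch below shows one way these pieces can fit together on a toy one-dimensional task. It is not the authors' ANRBF-GD-Sarsa(lambda): the record gives no algorithmic details, so the adaptive part is omitted, the shaping term uses a fixed, hypothetical potential function in the standard potential-based form F = gamma * Phi(s') - Phi(s), and the task, RBF centers, width, and all hyperparameters are assumptions chosen for illustration.

```python
import math
import random

CENTERS = [i / 4 for i in range(5)]   # hypothetical RBF centers on [0, 1]
SIGMA = 0.2                           # hypothetical RBF width
ACTIONS = (-0.1, 0.1)                 # toy task: step left or right
GAMMA, LAMBDA, ALPHA, EPSILON = 0.95, 0.8, 0.05, 0.1  # assumed values


def nrbf_features(s):
    """Normalized Gaussian RBF activations; the components sum to 1."""
    raw = [math.exp(-((s - c) ** 2) / (2 * SIGMA ** 2)) for c in CENTERS]
    total = sum(raw)
    return [r / total for r in raw]


def potential(s):
    """Hypothetical fixed potential: larger closer to the goal at s = 1."""
    return s


def shaping_reward(s, s_next, terminal=False):
    """Potential-based shaping term F = gamma * Phi(s') - Phi(s)."""
    next_potential = 0.0 if terminal else potential(s_next)
    return GAMMA * next_potential - potential(s)


def q_value(w, phi, a):
    """Linear action value: dot product of weights for action a and features."""
    return sum(wi * fi for wi, fi in zip(w[a], phi))


def choose_action(w, phi, rng):
    """Epsilon-greedy action selection over the linear action values."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    values = [q_value(w, phi, a) for a in range(len(ACTIONS))]
    return values.index(max(values))


def train(episodes=50, max_steps=200, seed=0):
    """Gradient-descent Sarsa(lambda) with accumulating traces."""
    rng = random.Random(seed)
    n_a, n_f = len(ACTIONS), len(CENTERS)
    w = [[0.0] * n_f for _ in range(n_a)]
    for _ in range(episodes):
        e = [[0.0] * n_f for _ in range(n_a)]  # eligibility traces
        s = 0.0
        phi = nrbf_features(s)
        a = choose_action(w, phi, rng)
        for _ in range(max_steps):
            s_next = min(1.0, max(0.0, s + ACTIONS[a]))
            done = s_next >= 1.0
            # Environment reward (-1 per step, 0 at the goal) plus shaping.
            r = (0.0 if done else -1.0) + shaping_reward(s, s_next, done)
            # Decay all traces, then add the gradient of Q(s, a), which is phi.
            for b in range(n_a):
                for i in range(n_f):
                    e[b][i] *= GAMMA * LAMBDA
            for i, f in enumerate(phi):
                e[a][i] += f
            # TD error; the bootstrap term is dropped at the terminal state.
            delta = r - q_value(w, phi, a)
            if not done:
                phi_next = nrbf_features(s_next)
                a_next = choose_action(w, phi_next, rng)
                delta += GAMMA * q_value(w, phi_next, a_next)
            for b in range(n_a):
                for i in range(n_f):
                    w[b][i] += ALPHA * delta * e[b][i]
            if done:
                break
            s, phi, a = s_next, phi_next, a_next
    return w
```

Calling `train()` returns one weight vector per action. Because the shaping term is a difference of potentials, it changes the reward the learner sees at each step without changing which policies are optimal, which is why this form is a common baseline for reward shaping.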