More on training strategies for critic and action neural networks in dual heuristic programming method

Cited by: 0
Authors
Lendaris, GG
Paintz, C
Shannon, T
Institution
Keywords
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This paper, prepared for the special session on Adaptive Critic Design Methods at the SMC '97 Conference, describes a modification to the procedures usually reported to date for training the Critic and Action neural networks in the Dual Heuristic Programming (DHP) method [7]-[12]. The modification entails updating both the Critic and the Action networks in each computational cycle, rather than only one at a time. The distinction lies in the introduction of a (real) second copy of the Critic network whose weights are adjusted less often (once per "epoch", where an epoch is defined to comprise some number N>1 of computational cycles); the "desired value" for training the other Critic is obtained from this Critic-Copy. In a previous publication [4], the proposed modified training strategy was demonstrated on the well-known pole-cart controller problem. In that paper, the full 6-dimensional state vector was input to the Critic and Action NNs; however, the utility function involved only the pole angle, not the distance along the track (x). For the first set of results presented here, the 3 states associated with the x variable were eliminated from the inputs to the NNs, keeping the same utility function previously defined. This resulted in improved learning and controller performance. From this point, the method is applied to two additional problems of increasing complexity: in the first, an x-related term is added to the utility function for the pole-cart problem and, simultaneously, the x-related states are added back in to the NNs (i.e., the number of state variables used increases from 3 to 6); the second relates to steering a vehicle with independent drive motors on each wheel. The problem contexts and experimental results are provided.
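The training schedule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a scalar linear plant x' = a*x + b*u, a quadratic utility U = q*x^2 + r*u^2, a linear critic lambda(x) = w*x and a linear action u = -k*x in place of neural networks. All names and parameter values (train_dhp_with_critic_copy, a, b, q, r, gamma, the learning rates) are hypothetical and chosen only for illustration. What it keeps from the abstract is the schedule: the Critic and the Action are both updated every computational cycle, the critic's desired value comes from a Critic-Copy, and the copy is refreshed once per epoch of N > 1 cycles.

```python
import random


def train_dhp_with_critic_copy(a=0.9, b=0.5, q=1.0, r=0.1, gamma=0.95,
                               n_epochs=300, cycles_per_epoch=10,
                               lr_critic=0.1, lr_action=0.1, seed=0):
    """Toy DHP loop on an assumed scalar linear plant (illustrative only)."""
    rng = random.Random(seed)
    w = 0.0       # critic being trained: lambda(x) = w * x estimates dJ/dx
    w_copy = 0.0  # Critic-Copy: supplies the desired values, changes once per epoch
    k = 0.0       # action (controller) gain: u = -k * x

    for _ in range(n_epochs):
        for _ in range(cycles_per_epoch):
            x = rng.uniform(-1.0, 1.0)   # sampled training state
            u = -k * x
            du_dx = -k
            x_next = a * x + b * u       # assumed plant model
            lam_next = w_copy * x_next   # costate at next state, from the copy

            # Desired value for the critic: total derivative of
            # U(x, u) + gamma * J(x_next) with respect to x.
            target = (2.0 * q * x + 2.0 * r * u * du_dx
                      + gamma * lam_next * (a + b * du_dx))
            w += lr_critic * (target - w * x) * x

            # Action update in the same cycle: descend dJ/du, chained through
            # du/dk = -x.
            dJ_du = 2.0 * r * u + gamma * lam_next * b
            k -= lr_action * dJ_du * (-x)

        w_copy = w  # refresh the Critic-Copy once per epoch

    return w, k


if __name__ == "__main__":
    w, k = train_dhp_with_critic_copy()
    print("critic weight:", w)
    print("controller gain:", k)
    print("closed-loop pole:", 0.9 - 0.5 * k)
```

Holding the copy fixed for an epoch keeps the critic's training target stationary while both networks change, which is the role the abstract assigns to the Critic-Copy.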
Pages: 3067-3072
Page count: 6