Evolution-guided value iteration for optimal tracking control

Cited by: 0
Authors
Huang, Haiming
Wang, Ding [1]
Zhao, Mingming
Hu, Qinna
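Affiliations
[1] Beijing University of Technology, Faculty of Information Technology, Beijing 100124, People's Republic of China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China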
Keywords
Adaptive critic designs; Adaptive dynamic programming; Evolutionary computation; Intelligent control; Optimal tracking; Reinforcement learning; PARTICLE SWARM; REINFORCEMENT; CONVERGENCE; STABILITY; SYSTEMS;
DOI
10.1016/j.neucom.2024.127835
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, an evolution-guided value iteration (EGVI) algorithm is established to address optimal tracking problems for nonlinear nonaffine systems. Conventional adaptive dynamic programming algorithms rely on gradient information to improve the policy, in keeping with the first-order necessary condition for optimality. These methods run into limitations when the gradient information is intricate or the system dynamics are not differentiable. To address this, EGVI uses evolutionary computation to search for the optimal policy without requiring an action network. Competition within the policy population serves as the driving force for policy improvement, so EGVI can handle complex and non-differentiable systems. Because it is population-based, the method also has the potential to enhance exploration efficiency and improve the robustness of the algorithm. Furthermore, the convergence of the algorithm and the stability of the policy are investigated within the EGVI framework. Finally, the effectiveness of the established method is demonstrated through two simulation experiments.
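The abstract does not give the update equations, so the following is only a minimal sketch of the general idea it describes: value iteration in which the policy-improvement step is a population-based search instead of a gradient step. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar error dynamics in `dynamics`, the quadratic `stage_cost`, the grid-based value table used in place of a critic network, the (mu + lambda) selection scheme, and all function and parameter names.

```python
import numpy as np


def dynamics(e, u):
    # Hypothetical tracking-error dynamics: nonaffine in u and
    # non-differentiable at u = 0 because of the |u| term.
    return 0.9 * e + u + 0.2 * np.abs(u)


def stage_cost(e, u):
    # Hypothetical quadratic utility on the tracking error and the control.
    return e**2 + u**2


def evolve_actions(e_grid, V, rng, pop_size=20, generations=15,
                   u_max=2.0, sigma=0.3):
    """For each grid state, search for the action that minimizes
    stage cost plus next-state value, using mutation and elitist
    (mu + lambda) selection. No gradient of the dynamics is used."""
    E = e_grid[:, None]  # column of states, broadcast against actions

    def q_value(u):
        # np.interp holds the end values for next states outside the grid.
        return stage_cost(E, u) + np.interp(dynamics(E, u), e_grid, V)

    # One population of candidate actions per grid state.
    pop = rng.uniform(-u_max, u_max, size=(e_grid.size, pop_size))
    pop[:, 0] = 0.0  # always include the zero action as a candidate
    for _ in range(generations):
        noise = sigma * rng.standard_normal(pop.shape)
        children = np.clip(pop + noise, -u_max, u_max)
        both = np.concatenate([pop, children], axis=1)
        # Competition: keep the pop_size cheapest candidates, best first.
        order = np.argsort(q_value(both), axis=1)[:, :pop_size]
        pop = np.take_along_axis(both, order, axis=1)
        sigma *= 0.8  # shrink the mutation step over generations
    best_u = pop[:, 0]
    return best_u, q_value(best_u[:, None])[:, 0]


def egvi_sketch(n_iters=30, seed=0):
    """Value iteration from V_0 = 0 with evolutionary policy improvement."""
    rng = np.random.default_rng(seed)
    e_grid = np.linspace(-2.0, 2.0, 41)
    V = np.zeros_like(e_grid)
    policy = np.zeros_like(e_grid)
    for _ in range(n_iters):
        policy, V = evolve_actions(e_grid, V, rng)
    return e_grid, V, policy
```

Calling `egvi_sketch()` returns the error grid, the value table, and the greedy action found for each grid state. In this toy setting the selected actions push the error toward zero, and the search only ever evaluates `dynamics`, never its derivative, which is the property the abstract highlights for non-differentiable systems.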
Pages: 9