Evolution-guided value iteration for optimal tracking control

Authors
Huang, Haiming
Wang, Ding [1]
Zhao, Mingming
Hu, Qinna
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
Adaptive critic designs; Adaptive dynamic programming; Evolutionary computation; Intelligent control; Optimal tracking; Reinforcement learning; PARTICLE SWARM; REINFORCEMENT; CONVERGENCE; STABILITY; SYSTEMS;
DOI
10.1016/j.neucom.2024.127835
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article establishes an evolution-guided value iteration (EGVI) algorithm for optimal tracking problems of nonlinear nonaffine systems. Conventional adaptive dynamic programming algorithms rely on gradient information to improve the policy, in line with the first-order necessary condition. These methods run into limitations when the gradient information is intricate or the system dynamics are not differentiable. To address this, EGVI uses evolutionary computation to search for the optimal policy without requiring an action network, and competition within the policy population drives policy improvement. EGVI can therefore handle complex and non-differentiable systems. Because it is population-based, the method also has the potential to improve exploration efficiency and algorithm robustness. The convergence of the algorithm and the stability of the policy are investigated within the EGVI framework. Finally, two simulation experiments demonstrate the effectiveness of the method.
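The idea in the abstract, value iteration in which the policy-improvement step is a population-based search rather than a gradient step, can be illustrated with a small sketch. This is not the authors' implementation: the scalar error dynamics, the quadratic utility, the tabular value grid, the elitist mutate-and-select search, and every name and parameter below are assumptions chosen only to keep the example short and runnable.

```python
import random


def dynamics(e, u):
    # Hypothetical nonaffine tracking-error dynamics, non-differentiable at u = 0.
    return 0.9 * e + u - 0.2 * abs(u)


def utility(e, u):
    # Assumed quadratic stage cost on tracking error and control effort.
    return e * e + u * u


def interp(grid, values, e):
    # Piecewise-linear value lookup; states outside the grid are clipped.
    e = min(max(e, grid[0]), grid[-1])
    h = grid[1] - grid[0]
    i = min(int((e - grid[0]) / h), len(grid) - 2)
    w = (e - grid[i]) / h
    return (1 - w) * values[i] + w * values[i + 1]


def evolve_action(e, grid, values, rng, pop=20, gens=10, u_max=2.0, sigma=0.3):
    # Gradient-free minimisation of Q(e, u) = U(e, u) + V(F(e, u)) over u.
    def q(u):
        return utility(e, u) + interp(grid, values, dynamics(e, u))

    actions = [rng.uniform(-u_max, u_max) for _ in range(pop)]
    for _ in range(gens):
        actions.sort(key=q)                      # competition within the population
        elite = actions[: pop // 2]              # the better half survives unchanged
        children = [min(max(u + rng.gauss(0.0, sigma), -u_max), u_max)
                    for u in elite]              # Gaussian mutation, clipped to bounds
        actions = elite + children
    best = min(actions, key=q)
    return best, q(best)


def egvi_sketch(iters=30, seed=0):
    # Value iteration from V_0 = 0; each sweep replaces V(e) by the best Q found.
    rng = random.Random(seed)
    grid = [-2.0 + 0.2 * i for i in range(21)]
    values = [0.0] * len(grid)
    for _ in range(iters):
        values = [evolve_action(e, grid, values, rng)[1] for e in grid]
    return grid, values, rng


if __name__ == "__main__":
    grid, values, rng = egvi_sketch()
    u, _ = evolve_action(1.0, grid, values, rng)
    print("approximate V(1.0):", interp(grid, values, 1.0))
    print("action chosen at e = 1.0:", u)
```

No action network appears anywhere in the sketch: the control at a given error is recovered on demand by rerunning `evolve_action` against the current value table. The search touches `dynamics` only through function evaluations, so the `abs(u)` term causes no difficulty.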
Pages: 9