The Two-Stage PI2 Control Strategy

Cited by: 2
Authors
Varnai, Peter [1]
Dimarogonas, Dimos V. [1]
Affiliation
[1] KTH Royal Inst Technol, Sch Elect Engn & Comp Sci, Div Decis & Control Syst, S-11428 Stockholm, Sweden
Source
Funding
Swedish Research Council
Keywords
Costs; Feedforward systems; Trajectory; Optimal control; System dynamics; Reinforcement learning; Real-time systems; Stochastic optimal control; Path integral policy improvement; Feynman-Kac theorem; Nonlinear control systems; Path-integral control
DOI
10.1109/LCSYS.2021.3137133
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
PI2 is a stochastic optimal control method generally regarded as a reinforcement learning algorithm. Recent work, however, suggests that the reinforcement learning aspect of PI2 actually appears when optimizing feedforward controls, which lead to optimal closed-loop performance once combined with feedback controls. These feedback terms are necessary to achieve the predicted performance, yet they have been largely neglected in the literature and in applications due to their complexity. In this letter, we show that the feedback terms actually take a simple-to-implement form for a wide range of system dynamics, paving the way for future research and applications of PI2. The correctness of the results is demonstrated through numerical simulations.
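The abstract does not spell out the algorithm. As a rough illustration of the feedforward optimization stage only, the following is a minimal sketch of a generic PI2-style update: sample perturbed control sequences, score each rollout, and average the perturbations with exponentiated-cost weights. It is not the letter's two-stage method and does not include the feedback terms the letter derives. The names `pi2_weights`, `pi2_update`, and `toy_cost`, the `temperature` parameter with min-max cost normalization, and the single-integrator toy problem are all assumptions made for this example.

```python
import math
import random


def pi2_weights(costs, temperature=10.0):
    """Map rollout costs to normalized weights; lower cost gets higher weight."""
    lo, hi = min(costs), max(costs)
    span = hi - lo
    if span == 0.0:
        # All rollouts are equally good: fall back to uniform weights.
        return [1.0 / len(costs)] * len(costs)
    raw = [math.exp(-temperature * (c - lo) / span) for c in costs]
    total = sum(raw)
    return [r / total for r in raw]


def pi2_update(u, rollout_cost, num_rollouts=20, noise_std=0.5, rng=None):
    """One PI2-style update of a feedforward control sequence u (list of floats)."""
    rng = rng or random.Random(0)
    # Sample one Gaussian perturbation sequence per rollout.
    perturbations = [[rng.gauss(0.0, noise_std) for _ in u]
                     for _ in range(num_rollouts)]
    # Evaluate the cost of each perturbed control sequence.
    costs = [rollout_cost([ui + ei for ui, ei in zip(u, eps)])
             for eps in perturbations]
    weights = pi2_weights(costs)
    # Shift each control by the cost-weighted average of its perturbations.
    return [ui + sum(w * eps[t] for w, eps in zip(weights, perturbations))
            for t, ui in enumerate(u)]


def toy_cost(u, x0=1.0, dt=0.1):
    """Hypothetical toy problem: single integrator x' = u with quadratic cost."""
    x, cost = x0, 0.0
    for ui in u:
        cost += (x * x + 0.1 * ui * ui) * dt
        x += ui * dt
    return cost + 10.0 * x * x  # terminal penalty on the final state
```

A typical use would call `pi2_update(u, toy_cost, rng=rng)` repeatedly, starting from `u = [0.0] * 20` and a shared `random.Random` instance. The result is an open-loop control sequence only, which is exactly the part the abstract says must be paired with feedback to reach the predicted closed-loop performance.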
Pages: 2072-2077
Number of pages: 6