OPEN-LOOP OPTIMAL CONTROL FOR TRACKING A REFERENCE SIGNAL WITH APPROXIMATE DYNAMIC PROGRAMMING

Cited by: 0

Authors:

Diaz, Jorge A. [1]

Xu, Lei [2]

Sardarmehni, Tohid [3]

Affiliations:

[1] Univ Texas Rio Grande Valley, Dept Mech Engn, Edinburg, TX 78539 USA

[2] Kent State Univ, Dept Comp Sci, Kent, OH 44242 USA

[3] Calif State Univ Northridge, Dept Mech Engn, Northridge, CA 91330 USA

Source:

PROCEEDINGS OF ASME 2022 INTERNATIONAL MECHANICAL ENGINEERING CONGRESS AND EXPOSITION, IMECE2022, VOL 5 | 2022

Funding:

U.S. National Science Foundation;

Keywords:

optimal control; approximate dynamic programming; dynamic programming; neural networks;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Dynamic programming (DP) provides a systematic, closed-loop solution to optimal control problems. However, it suffers from the curse of dimensionality in high-order systems. Approximate dynamic programming (ADP) methods remedy this by seeking near-optimal rather than exact solutions. In brief, ADP uses function approximators, such as neural networks, to approximate the optimal control solution, and converges to a near-optimal solution using techniques such as reinforcement learning (RL). The two main challenges in this approach are finding a proper training domain and selecting a neural network architecture that can approximate the solution precisely with RL. Users select the training domain and the neural networks mostly by trial and error, which is tedious and time-consuming. This paper proposes trading the closed-loop solution provided by ADP methods for a more effective selection of the training domain. To do so, we train a neural network on a small, moving domain around the reference signal. We assess the method's effectiveness by applying it to a widely used benchmark problem, the Van der Pol oscillator, and to a real-world problem, controlling a quadrotor to track a reference trajectory. Simulation results demonstrate performance comparable to traditional methods while reducing computational requirements.
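The idea of a small, moving training domain around the reference signal can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the box-shaped domain, the half-width, the sinusoidal reference, and the controlled Van der Pol form used here are all assumptions made for illustration only.

```python
import numpy as np

def van_der_pol(x, u, mu=1.0):
    """Assumed controlled Van der Pol dynamics (control enters the second state)."""
    x1, x2 = x
    return np.array([x2, mu * (1.0 - x1**2) * x2 - x1 + u])

def sample_moving_domain(ref_traj, half_width, n_per_step, seed=0):
    """Draw training states uniformly from a box centered on each reference point.

    ref_traj: array of shape (T, n), the reference state at each time step.
    Returns an array of shape (T, n_per_step, n).
    """
    rng = np.random.default_rng(seed)
    ref_traj = np.asarray(ref_traj, dtype=float)
    T, n = ref_traj.shape
    # Offsets are bounded by half_width, so the domain stays small and
    # follows the reference as the time index advances.
    offsets = rng.uniform(-half_width, half_width, size=(T, n_per_step, n))
    return ref_traj[:, None, :] + offsets

# Hypothetical reference signal: a unit circle in the (x1, x2) plane.
t = np.linspace(0.0, 2.0 * np.pi, 50)
ref = np.stack([np.sin(t), np.cos(t)], axis=1)
samples = sample_moving_domain(ref, half_width=0.1, n_per_step=20)
```

Under these assumptions, a function approximator would only ever be trained on `samples`, a thin tube around the reference, rather than on a large fixed region of the state space chosen by trial and error.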


Pages: 7

