Policy Iteration Based Approximate Dynamic Programming Toward Autonomous Driving in Constrained Dynamic Environment

Cited by: 21

Authors:

Lin, Ziyu [1]

Ma, Jun [2,3]

Duan, Jingliang [4]

Li, Shengbo Eben [1]

Ma, Haitong [1]

Cheng, Bo [1]

Lee, Tong Heng [5]

Affiliations:

[1] Tsinghua Univ, Sch Vehicle & Mobil, Beijing 100084, Peoples R China

[2] Hong Kong Univ Sci & Technol Guangzhou, Robot & Autonomous Syst Thrust, Guangzhou, Peoples R China

[3] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Hong Kong, Peoples R China

[4] Univ Sci & Technol Beijing, Sch Mech Engn, Beijing 100083, Peoples R China

[5] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 117583, Singapore

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2023 / Vol. 24 / Issue 05

Funding:

National Key Research and Development Program of China;

Keywords:

Planning; Autonomous vehicles; Vehicle dynamics; Task analysis; Heuristic algorithms; Approximation algorithms; Roads; Autonomous driving; approximate dynamic programming; motion planning; constrained optimization; reinforcement learning; VEHICLE;

DOI:

10.1109/TITS.2023.3237568

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813;

Abstract:

In autonomous driving, the motion planning problem is typically difficult to solve because the vehicle model is nonlinear and the driving scenarios are complex. In particular, most existing methods cannot be generalized to dynamically changing scenarios with varying surrounding vehicles. To address this problem, this work investigates the framework of integrated decision and control. Within this framework, static path planning determines the reference candidates ahead, and the optimal path-tracking controller then realizes the specific autonomous driving task. A constrained finite-horizon approximate dynamic programming (ADP) algorithm is presented to generate the desired control policy for effective path tracking. With a generalized policy neural network that maps the state to the control input, the proposed algorithm remains effective for motion planning in changing driving environments with varying surrounding vehicles. Moreover, the algorithm alleviates the typically heavy computational load through offline training and online execution. Because it uses multi-layer neural networks in conjunction with the actor-critic framework, the constrained ADP method is capable of handling complex and multidimensional scenarios. Finally, various simulations have been carried out to show that the constrained ADP algorithm is effective.


Pages: 5003 - 5013

Page count: 11

50 items in total

[21] A policy improvement method in constrained stochastic dynamic programming
Chang, Hyeong Soo
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2006, 51 (09) : 1523 - 1526
[22] Higher-order Nonlinear Discrete Approximate Iteration On the Continuing Dynamic Programming
Zhang, Peng
Zhou, Jingyi
ADVANCED RESEARCH ON INDUSTRY, INFORMATION SYSTEM AND MATERIAL ENGINEERING, 2012, 459 : 571 - +
[23] Autonomous Driving Dynamic-programming Algorithm Based on Improved Artificial Potential Field
Luo Y.-T.
Shi Z.-X.
Liang W.-Q.
Zhongguo Gonglu Xuebao/China Journal of Highway and Transport, 2022, 35 (12) : 279 - 292
[24] Approximate dynamic programming based on expansive projections
Arruda, Edilson R.
do Val, Joao B. R.
PROCEEDINGS OF THE 45TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-14, 2006, : 5540 - +
[25] Explicit MPC based on Approximate Dynamic Programming
Bakarac, Peter
Holaza, Juraj
Kaluz, Martin
Klauco, Martin
Lofberg, Johan
Kvasnica, Michal
2018 EUROPEAN CONTROL CONFERENCE (ECC), 2018, : 1172 - 1177
[26] An approximate dynamic programming approach to a communication constrained sensor management problem
Williams, JL
Fisher, JW
Willsky, AS
2005 7TH INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION), VOLS 1 AND 2, 2005, : 582 - 589
[27] Approximate dynamic programming for communication-constrained sensor network management
Williams, Jason L.
Fisher, John W., III
Willsky, Alan S.
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2007, 55 (08) : 4300 - 4311
[28] Performance Guarantee of an Approximate Dynamic Programming Policy for Robotic Surveillance
Park, Myoungkuk
Kalyanam, Krishnamoorthy
Darbha, Swaroop
Khargonekar, Pramod P.
Pachter, Meir
Chandler, Phillip R.
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2016, 13 (02) : 564 - 578
[29] An Approximate Dynamic Programming Approach for Path Following Control of an Autonomous Vehicle
Zhao, Kun
Wang, Jian
Xu, Xin
Huang, Zhenhua
2014 11TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2014, : 1998 - 2004
[30] Cooperative Navigation for Heterogeneous Autonomous Vehicles via Approximate Dynamic Programming
Ferrari, Silvia
Anderson, Michael
Fierro, Rafael
Lu, Wenjie
2011 50TH IEEE CONFERENCE ON DECISION AND CONTROL AND EUROPEAN CONTROL CONFERENCE (CDC-ECC), 2011, : 121 - 127
