Multi-Agent Constrained Policy Optimization for Conflict-Free Management of Connected Autonomous Vehicles at Unsignalized Intersections

Cited by: 3
Authors
Zhao, Rui [1]
Li, Yun [2]
Gao, Fei [3]
Gao, Zhenhai [3]
Zhang, Tianyao [3]
Affiliations
[1] Jilin Univ, Coll Automot Engn, Changchun 130025, Peoples R China
[2] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo 1138654, Japan
[3] Jilin Univ, State Key Lab Automot Simulat & Control, Changchun 130025, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Safety; Computational efficiency; Trajectory; Autonomous vehicles; Roads; Collaboration; Vehicle dynamics; Conflict-free management; connected autonomous vehicles; safety reinforcement learning; multi-agent constrained policy optimization; unsignalized intersections; AUTOMATED VEHICLES; SYSTEM
DOI
10.1109/TITS.2023.3331723
Chinese Library Classification number
TU [Architectural Science]
Discipline classification code
0813
Abstract
Autonomous Intersection Management (AIM) systems present a new paradigm for conflict-free cooperation of connected autonomous vehicles (CAVs) at road intersections, aiming to eliminate collisions and improve traffic efficiency and ride comfort. Because current centralized coordination methods struggle to balance high computational efficiency with robust safety assurance, this paper proposes a conflict-free management scheme for CAVs at unsignalized intersections that leverages safe multi-agent deep reinforcement learning (MADRL). First, we formulate the safe MADRL problem as a constrained Markov game (CMG) and then transform the AIM problem into a CMG by carefully designing state, action, reward, and cost functions. Next, we propose Multi-Agent Constrained Policy Optimization (MACPO), tailored specifically to solving the CMG problem. MACPO incorporates safety constraints that further restrict the trust region formed by the Kullback-Leibler (KL) divergence, so that reinforcement learning policy updates maximize performance while keeping constraint costs within their bounds. On this basis we introduce the MACPO-based AIM algorithm. Finally, we train an AIM policy and compare its computation time, ride comfort, traffic efficiency, and safety with management schemes based on Model Predictive Control (MPC), Mixed Integer Programming (MIP), and non-safety-aware reinforcement learning. Compared with the MPC and MIP methods, our method increases computational efficiency by 65.22 times and 731.52 times, respectively, and improves traffic efficiency by 2.41 times and 1.80 times, respectively. In contrast to the non-safety-aware RL methods, our method achieves a zero collision rate for the first time while also enhancing ride comfort, highlighting the advantages of using MACPO.
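The constrained update described in the abstract can be illustrated with a generic sketch. The block below writes the standard constrained-policy-optimization-style trust-region problem for a single agent i; it is an assumption for illustration only, not the paper's formulation. All symbols are hypothetical and not taken from this record: \pi^{i}_{k} is agent i's current policy, A_{r} and A_{c} are reward and cost advantage functions, J_{c} is the expected cumulative cost, d is the cost limit, and \delta is the KL trust-region radius. Discount-dependent normalization constants are omitted, and the paper's multi-agent update may differ in form.

```latex
\begin{aligned}
\pi^{i}_{k+1} = \arg\max_{\pi^{i}} \;&
  \mathbb{E}_{s,a \sim \pi_{k}}\!\left[
    \frac{\pi^{i}(a^{i} \mid s)}{\pi^{i}_{k}(a^{i} \mid s)}\, A_{r}(s,a)
  \right] \\
\text{s.t.} \;&
  J_{c}(\pi_{k}) +
  \mathbb{E}_{s,a \sim \pi_{k}}\!\left[
    \frac{\pi^{i}(a^{i} \mid s)}{\pi^{i}_{k}(a^{i} \mid s)}\, A_{c}(s,a)
  \right] \le d, \\
& \bar{D}_{\mathrm{KL}}\!\left(\pi^{i}_{k} \,\middle\|\, \pi^{i}\right) \le \delta .
\end{aligned}
```

In this reading, the KL constraint defines the trust region and the cost constraint cuts it down further, which matches the abstract's description of safety constraints restricting the region in which the policy may move.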
Pages: 5374-5388
Page count: 15