Multi-Agent Constrained Policy Optimization for Conflict-Free Management of Connected Autonomous Vehicles at Unsignalized Intersections

Cited by: 3
Authors
Zhao, Rui [1 ]
Li, Yun [2 ]
Gao, Fei [3 ]
Gao, Zhenhai [3 ]
Zhang, Tianyao [3 ]
Affiliations
[1] Jilin Univ, Coll Automot Engn, Changchun 130025, Peoples R China
[2] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo 1138654, Japan
[3] Jilin Univ, State Key Lab Automot Simulat & Control, Changchun 130025, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Safety; Computational efficiency; Trajectory; Autonomous vehicles; Roads; Collaboration; Vehicle dynamics; Conflict-free management; connected autonomous vehicles; safety reinforcement learning; multi-agent constrained policy optimization; unsignalized intersections; AUTOMATED VEHICLES; SYSTEM;
DOI
10.1109/TITS.2023.3331723
Chinese Library Classification (CLC) number
TU [Building Science];
Discipline classification code
0813;
Abstract
Autonomous Intersection Management (AIM) systems present a new paradigm for conflict-free cooperation of connected autonomous vehicles (CAVs) at road intersections, with the aim of eliminating collisions and improving traffic efficiency and ride comfort. Because current centralized coordination methods struggle to balance high computational efficiency with robust safety assurance, this paper proposes an innovative conflict-free management scheme for CAVs at unsignalized intersections that leverages safe multi-agent deep reinforcement learning (MADRL). First, we formulate the safe MADRL problem as a constrained Markov game (CMG) and then transform the AIM problem into a CMG by carefully designing the state, action, reward, and cost functions. Next, we propose Multi-Agent Constrained Policy Optimization (MACPO), specifically tailored to solve the CMG problem. MACPO incorporates safety constraints that further restrict the trust region formed by the Kullback-Leibler (KL) divergence, so that reinforcement learning policy updates maximize performance while keeping constraint costs within their limits. This leads to the MACPO-based AIM algorithm. Finally, we train an AIM policy and compare its computation time, ride comfort, traffic efficiency, and safety with management schemes based on Model Predictive Control (MPC), Mixed Integer Programming (MIP), and non-safety-aware reinforcement learning. According to the results, compared with the MPC and MIP methods, our method increases computational efficiency by 65.22 times and 731.52 times, respectively, and improves traffic efficiency by 2.41 times and 1.80 times, respectively. In contrast to the non-safety-aware RL methods, our method achieves a zero collision rate for the first time while also enhancing ride comfort, highlighting the advantages of using MACPO.
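As an illustration of the kind of update the abstract describes, a single-agent constrained policy optimization step is commonly written in the form below. This is a generic sketch, not the paper's MACPO formulation. The symbols $\pi_k$ (current policy), $\rho^{\pi_k}$ (its state distribution), $A^{\pi_k}$ and $A_C^{\pi_k}$ (reward and cost advantages), $J_C$ (expected cost), $d$ (cost limit), $\delta$ (trust-region radius), and $\gamma$ (discount factor) are assumptions introduced here and do not appear in this record.

```latex
% Generic constrained trust-region policy update (single-agent form).
% All symbols are illustrative assumptions, not taken from the paper.
\begin{aligned}
\pi_{k+1} = \arg\max_{\pi} \quad
  & \mathbb{E}_{s \sim \rho^{\pi_k},\, a \sim \pi}\left[ A^{\pi_k}(s,a) \right] \\
\text{s.t.} \quad
  & J_{C}(\pi_k) + \frac{1}{1-\gamma}\,
    \mathbb{E}_{s \sim \rho^{\pi_k},\, a \sim \pi}\left[ A_{C}^{\pi_k}(s,a) \right] \le d, \\
  & \bar{D}_{\mathrm{KL}}\left(\pi \,\|\, \pi_k\right) \le \delta .
\end{aligned}
```

The KL bound defines the trust region, and the cost constraint cuts it down further, which is the mechanism the abstract attributes to MACPO. How the paper extends this to multiple agents is not stated in this record.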
Pages: 5374 - 5388
Page count: 15
Related papers
48 in total
  • [31] Modeling Interactions of Autonomous/Manual Vehicles and Pedestrians with a Multi-Agent Deep Deterministic Policy Gradient
    Hu, Weichao
    Mu, Hongzhang
    Chen, Yanyan
    Liu, Yixin
    Li, Xiaosong
    SUSTAINABILITY, 2023, 15 (07)
  • [32] An Improved Acceleration Method Based on Multi-Agent System for AGVs Conflict-Free Path Planning in Automated Terminals
    Guo, Kunlun
    Zhu, Jin
    Shen, Lei
    IEEE ACCESS, 2021, 9 : 3326 - 3338
  • [33] Real-time multi-agent fleet management strategy for autonomous underground mines vehicles
    Gamache, M.
    Basilico, G.
    Frayret, J. -M.
    Riopel, D.
    INTERNATIONAL JOURNAL OF MINING RECLAMATION AND ENVIRONMENT, 2023, 37 (09) : 649 - 666
  • [34] Spatial-Temporal-Aware Safe Multi-Agent Reinforcement Learning of Connected Autonomous Vehicles in Challenging Scenarios
    Zhang, Zhili
    Han, Songyang
    Wang, Jiangwei
    Miao, Fei
    2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA, 2023, : 5574 - 5580
  • [35] Safe and efficient manoeuvring for emergency vehicles in autonomous traffic using multi-agent proximal policy optimisation
    Parada, Leandro
    Candela, Eduardo
    Marques, Luis
    Angeloudis, Panagiotis
    TRANSPORTMETRICA A-TRANSPORT SCIENCE, 2023,
  • [36] Sequential game solution for lane-merging conflict between autonomous vehicles A multi-agent reinforcement learning approach
    Wu, Yang
    Zhang, Zhiyong
    Yuan, Jianhua
    Ma, Qing
    Gao, Lifen
    2016 IEEE 19TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2016, : 1482 - 1488
  • [37] Multi-Agent Reinforcement Learning with Information-sharing Constrained Policy Optimization for Global Cost Environment
    Okawa, Yoshihiro
    Dan, Hayato
    Morita, Natsuki
    Ogawa, Masatoshi
    IFAC PAPERSONLINE, 2023, 56 (02) : 1558 - 1565
  • [38] A Privacy-Preserving-Based Distributed Collaborative Scheme for Connected Autonomous Vehicles at Multi-Lane Signal-Free Intersections
    Zhao, Yuan
    Gong, Dekui
    Wen, Shixi
    Ding, Lei
    Guo, Ge
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (07) : 6824 - 6835
  • [39] MULTI-AGENT BASED SOLUTION FOR FREE FLIGHT CONFLICT DETECTION AND RESOLUTION USING PARTICLE SWARM OPTIMIZATION ALGORITHM
    Emami, Hojjat
    Derakhshan, Farnaz
    UNIVERSITY POLITEHNICA OF BUCHAREST SCIENTIFIC BULLETIN SERIES C-ELECTRICAL ENGINEERING AND COMPUTER SCIENCE, 2014, 76 (03) : 49 - 64
  • [40] Multi-agent based solution for free flight conflict detection and resolution using particle swarm optimization algorithm
    Emami, Hojjat
    Derakhshan, Farnaz
    UPB Scientific Bulletin, Series C: Electrical Engineering and Computer Science, 2014, 76 (03) : 49 - 64