Energy Constrained Multi-Agent Reinforcement Learning for Coverage Path Planning

Cited by: 0
Authors
Zhao, Chenyang [1]
Liu, Juan [2]
Yoon, Suk-Un [3]
Li, Xinde [1,4]
Li, Heqing [1]
Zhang, Zhentong [1,4]
Affiliations
[1] Southeast Univ, Nanjing 210096, Peoples R China
[2] Samsung Elect China R&D Ctr, Nanjing 210012, Peoples R China
[3] Samsung Elect, Suwon 16677, Gyeonggi Do, South Korea
[4] Nanjing Ctr Appl Math, Nanjing 211135, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
NAVIGATION;
DOI
10.1109/IROS55552.2023.10341412
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing research on the multi-agent area coverage path planning problem treats it as a combination of the Traveling Salesman Problem (TSP) and Coverage Path Planning (CPP). However, these approaches suffer from poor observation ability in the online phase and high computational cost in the offline phase, which makes them difficult to apply to energy-constrained Unmanned Aerial Vehicles (UAVs) and to adapt dynamically. In this paper, we decompose the task into two sub-problems: multi-agent path planning and sub-region CPP. We model the multi-agent path planning problem as a Collective Markov Decision Process (C-MDP) and design an Energy Constrained Multi-Agent Reinforcement Learning (ECMARL) algorithm based on the centralized-training, distributed-execution paradigm. To account for the energy constraints of UAVs, a UAV propulsion power model is established to measure energy consumption, and a load balancing strategy is applied to dynamically allocate target areas to each UAV. If a UAV runs low on energy, ECMARL adjusts the mission strategy in real time according to environmental information and the remaining energy of the other UAVs. When the UAVs reach their sub-regions of interest, Back-and-Forth Paths (BFPs) are adopted to solve the sub-region CPP problem, ensuring full coverage and optimality at low computational complexity. Comprehensive theoretical analysis and experiments demonstrate that ECMARL outperforms the traditional offline TSP-CPP strategy in both solution quality and computation time, and effectively handles energy-constrained UAVs.
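Illustrative sketch (not taken from the paper): the Back-and-Forth Paths mentioned in the abstract correspond to a boustrophedon ("lawnmower") sweep of a sub-region. Below is a minimal Python sketch, assuming the sub-region is discretized into width x height grid cells; the function name and parameterization are hypothetical.

# A minimal sketch (assumption, not the paper's implementation) of a
# back-and-forth (boustrophedon) sweep over a rectangular sub-region
# discretized into grid cells.
def back_and_forth_path(width, height):
    """Return grid-cell waypoints covering a width x height sub-region,
    sweeping each row and reversing direction on alternate rows."""
    path = []
    for row in range(height):
        cols = range(width) if row % 2 == 0 else range(width - 1, -1, -1)
        for col in cols:
            path.append((col, row))
    return path

if __name__ == "__main__":
    # Example: a 4 x 3 sub-region is covered in exactly 12 waypoints,
    # visiting each cell once (full coverage, no revisits).
    print(back_and_forth_path(4, 3))

Reversing the sweep direction on alternate rows visits every cell exactly once, which is why BFPs achieve full coverage of a rectangular sub-region with path length proportional to the number of cells.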
Pages: 5590 - 5597
Number of pages: 8