Effective Cross-Region Courier-Displacement for Instant Delivery via Reinforcement Learning

Cited by: 7
Authors
Hu, Shijie [1]
Guo, Baoshen [1]
Wang, Shuai [1]
Zhou, Xiaolei [1, 2]
Affiliations
[1] Southeast Univ, Sch Comp Sci & Technol, Nanjing, Peoples R China
[2] Natl Univ Def Technol, Res Inst 63, Zunyi, Guizhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Courier displacement; Reinforcement learning; Instant delivery
DOI
10.1007/978-3-030-85928-2_23
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With the rapid development of mobile phones and the Internet of Things, instant delivery services (e.g., UberEats and MeiTuan) have become a popular way for people to order food, fruit, and other groceries online, especially since the onset of COVID-19. In instant delivery services, it is important to dispatch a massive number of orders to a limited number of couriers, especially during rush hours. To meet this need, an efficient courier displacement mechanism can not only balance demand (orders to be picked up) and supply (couriers' capacity) but also improve the efficiency of order delivery by reducing idle displacement time. Existing studies on fleet management for ride-sharing or bike rebalancing cannot be applied to courier displacement in instant delivery because of its unique practical factors, including differences between regions and strict delivery time constraints. In this work, we propose an efficient cross-region courier displacement method, Courier Displacement Reinforcement Learning (CDRL for short), based on multi-agent actor-critic, which considers the dynamic demand and supply at the region level as well as strict time constraints. Specifically, the multi-agent actor-critic reinforcement learning-based courier displacement framework uses a policy network to generate displacement decisions that take multiple practical factors into account, and a value network to evaluate the decisions of the policy network. A one-month dataset of real-world order records from Shanghai, collected from Eleme (one of the biggest instant delivery services in China), is used in the evaluation. The results show that our method offers up to a 36% increase in courier displacement performance and reduces idle ride time by 17%.
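The actor-critic structure described in the abstract (a policy that proposes displacement decisions and a value function that evaluates them) can be illustrated with a minimal sketch. This is not the authors' CDRL implementation: the linear models, the `RegionAgent` class, the state layout (demand-supply gaps of a region and its neighbors), the action set, and all hyperparameters are assumptions made purely for illustration. A multi-agent setup would presumably instantiate one such agent per region.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class RegionAgent:
    """Hypothetical per-region agent: linear softmax actor plus linear critic."""

    def __init__(self, n_features, n_actions,
                 actor_lr=0.05, critic_lr=0.1, gamma=0.9):
        # Actor weights: one row of feature weights per action.
        self.theta = [[0.0] * n_features for _ in range(n_actions)]
        # Critic weights: linear state-value estimate.
        self.w = [0.0] * n_features
        self.actor_lr = actor_lr
        self.critic_lr = critic_lr
        self.gamma = gamma

    def policy(self, state):
        """Probability of each displacement action in the given state."""
        scores = [sum(t * s for t, s in zip(row, state)) for row in self.theta]
        return softmax(scores)

    def value(self, state):
        """Critic's estimate of the expected return from the given state."""
        return sum(w * s for w, s in zip(self.w, state))

    def act(self, state, rng):
        """Sample an action index from the current policy."""
        probs = self.policy(state)
        return rng.choices(range(len(probs)), weights=probs)[0]

    def update(self, state, action, reward, next_state):
        """One-step actor-critic update; returns the TD error."""
        td_error = (reward + self.gamma * self.value(next_state)
                    - self.value(state))
        # Critic: move the value estimate toward the one-step TD target.
        self.w = [w + self.critic_lr * td_error * s
                  for w, s in zip(self.w, state)]
        # Actor: policy-gradient step, using the TD error as the advantage.
        # For a linear softmax policy, d log pi(a|s) / d theta_k
        # equals (1[k == a] - pi_k) * s.
        probs = self.policy(state)
        for k, row in enumerate(self.theta):
            coeff = (1.0 if k == action else 0.0) - probs[k]
            self.theta[k] = [t + self.actor_lr * td_error * coeff * s
                             for t, s in zip(row, state)]
        return td_error


if __name__ == "__main__":
    rng = random.Random(0)
    agent = RegionAgent(n_features=3, n_actions=3)
    # Assumed state: demand-supply gap here, in neighbor A, in neighbor B.
    state = [1.0, 0.5, 2.0]
    # Assumed actions: 0 = stay, 1 = move to A, 2 = move to B.
    action = agent.act(state, rng)
    # Assumed toy reward: the gap of the region the courier ends up in.
    agent.update(state, action, reward=state[action],
                 next_state=[0.0, 0.0, 0.0])
    print(agent.policy(state))
```

In this toy setup, a positive TD error raises the probability of the action that was taken and lowers the others, which is the sense in which the value function "evaluates" the policy's decisions. A realistic system would replace the linear models with neural networks and derive the reward from delivery-time constraints and idle ride time, neither of which is specified in this record.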
Pages: 288-300
Number of pages: 13