Effective Cross-Region Courier-Displacement for Instant Delivery via Reinforcement Learning

Cited by: 7

Authors:

Hu, Shijie [1]

Guo, Baoshen [1]

Wang, Shuai [1]

Zhou, Xiaolei [1,2]

Affiliations:

[1] Southeast Univ, Sch Comp Sci & Technol, Nanjing, Peoples R China

[2] Natl Univ Def Technol, Res Inst 63, Zunyi, Guizhou, Peoples R China

Source:

WIRELESS ALGORITHMS, SYSTEMS, AND APPLICATIONS, WASA 2021, PT I | 2021 / Vol. 12937

Funding:

National Natural Science Foundation of China;

Keywords:

Courier displacement; Reinforcement learning; Instant delivery;

DOI:

10.1007/978-3-030-85928-2_23

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

With the rapid development of mobile phones and the Internet of Things, instant delivery services (e.g., UberEats and MeiTuan) have become a popular way for people to order food, fruit, and other groceries online, especially since the outbreak of COVID-19. In instant delivery services, it is important to dispatch massive numbers of orders to a limited pool of couriers, especially during rush hours. To meet this need, an efficient courier displacement mechanism can not only balance demand (orders to be picked up) and supply (couriers' capacity) but also improve the efficiency of order delivery by reducing idle displacement time. Existing studies on fleet management for ride-sharing or bike rebalancing cannot be applied to courier displacement in instant delivery because of practical factors unique to instant delivery, including regional differences and strict delivery time constraints. In this work, we propose an efficient cross-region courier displacement method, Courier Displacement Reinforcement Learning (CDRL), based on multi-agent actor-critic, which considers the dynamic demand and supply at the region level as well as strict time constraints. Specifically, the multi-agent actor-critic reinforcement learning-based courier displacement framework uses a policy network to generate displacement decisions that account for multiple practical factors, and a value network to evaluate the decisions of the policy network. A one-month real-world dataset of order records from Shanghai, collected from Eleme (one of the biggest instant delivery services in China), is used in the evaluation, and the results show that our method offers up to a 36% increase in courier displacement performance and reduces idle ride time by 17%.

Pages: 288-300

Page count: 13
