DRL Router: Distributional Reinforcement Learning-Based Router for Reliable Shortest Path Problems

Cited by: 4

Authors:

Guo, Hongliang [1]

Sheng, Wenda [2]

Gao, Chen [3]

Jin, Yaochu [4]

Affiliations:

[1] Sichuan Univ SCU, Coll Comp Sci, Chengdu 610065, Sichuan, Peoples R China

[2] Univ Elect Sci & Technol China, Chengdu 611731, Peoples R China

[3] Swiss Fed Inst Technol, ZH-8092 Zurich, Switzerland

[4] Bielefeld Univ, D-33619 Bielefeld, Germany

Source:

IEEE INTELLIGENT TRANSPORTATION SYSTEMS MAGAZINE | 2023 / Vol. 15 / Issue 05

Keywords:

Transportation; Reliability; Planning; Decision making; Routing; Navigation; Bibliographies; TRAVEL-TIME; STOCHASTIC NETWORKS; ALGORITHM; PROBABILITY;

DOI:

10.1109/MITS.2023.3265309

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification code:

0808 ; 0809 ;

Abstract:

This article studies reliable shortest path (RSP) problems in stochastic transportation networks. The term reliability in the RSP literature has many definitions, e.g., 1) maximal stochastic on-time arrival probability, 2) minimal travel time with a high-confidence constraint, 3) minimal mean and standard deviation combination, and 4) minimal expected disutility. To the best of our knowledge, almost all state-of-the-art RSP solutions are designed to target one specific RSP objective, and it is very difficult, if not impossible, to adapt them to other RSP objectives. To bridge the gap, this article develops a distributional reinforcement learning (DRL)-based algorithm, namely, DRL-Router, which serves as a universal solution to the four aforementioned RSP problems. DRL-Router employs the DRL method to approximate the full travel time distribution of a given routing policy and then makes improvements with respect to the user-defined RSP objective through a generalized policy iteration scheme. DRL-Router is 1) universal, i.e., it is applicable to a variety of RSP objectives; 2) model free, i.e., it does not rely on well-calibrated travel time distribution models; 3) adaptive, i.e., it adjusts to changes in the navigation objective; and 4) fast, i.e., it performs real-time decision making. Extensive experimental results and comparisons with baseline algorithms in various transportation networks justify both the accuracy and efficiency of DRL-Router.
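
The four RSP objectives named in the abstract can all be evaluated from one travel time distribution, which is the property that makes a distribution-based router reusable across objectives. The following is a minimal Python sketch of that idea only, not the DRL-Router algorithm itself: it assumes each candidate path's distribution is already available as a list of sampled travel times, and every function name, parameter, and data value here is hypothetical and chosen for illustration.

```python
import math
import statistics

# Hypothetical sampled travel times (minutes) for two candidate paths.
# Path A is usually fast but occasionally very slow; path B is steady.
paths = {
    "A": [10, 10, 10, 10, 30],
    "B": [13, 14, 15, 16, 17],
}

def on_time_probability(samples, budget):
    """Objective 1: empirical P(T <= budget); larger is better."""
    return sum(t <= budget for t in samples) / len(samples)

def confidence_travel_time(samples, alpha):
    """Objective 2: smallest sampled t with empirical P(T <= t) >= alpha."""
    ordered = sorted(samples)
    index = max(math.ceil(alpha * len(ordered)) - 1, 0)
    return ordered[index]

def mean_std(samples, weight):
    """Objective 3: mean plus weight times (population) standard deviation."""
    return statistics.fmean(samples) + weight * statistics.pstdev(samples)

def expected_disutility(samples, disutility):
    """Objective 4: average of a user-supplied disutility function."""
    return statistics.fmean([disutility(t) for t in samples])

def best_path(paths, score, maximize=False):
    """Pick the path name whose samples give the best score."""
    pick = max if maximize else min
    return pick(paths, key=lambda name: score(paths[name]))

def lateness_penalty(t):
    """Assumed disutility: squared lateness beyond a 15-minute target."""
    return max(t - 15, 0) ** 2

# The same distributions yield different choices under different objectives.
choices = {
    "on_time(budget=12)": best_path(
        paths, lambda s: on_time_probability(s, 12), maximize=True),
    "confidence(alpha=0.9)": best_path(
        paths, lambda s: confidence_travel_time(s, 0.9)),
    "mean_std(weight=1)": best_path(paths, lambda s: mean_std(s, 1.0)),
    "disutility": best_path(
        paths, lambda s: expected_disutility(s, lateness_penalty)),
}

for objective, name in choices.items():
    print(objective, "->", name)
```

With these illustrative samples, a tight 12-minute budget favors path A under the on-time arrival objective, while the other three objectives favor the steadier path B. Only the scoring function changes between objectives; the distribution estimate is shared.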


Pages: 91-108

Page count: 18

50 items in total

[41] Reinforcement learning-based dynamic obstacle avoidance and integration of path planning
Jaewan Choi
Geonhee Lee
Chibum Lee
Intelligent Service Robotics, 2021, 14 : 663 - 677
[42] ParaDiMe: A Distributed Memory FPGA Router Based on Speculative Parallelism and Path Encoding
Hoo, Chin Hau
Kumar, Akash
2017 IEEE 25TH ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM 2017), 2017, : 172 - 179
[43] Deep Reinforcement Learning-Based Enhancement of SATMAC for Reliable Channel Access in VANETs
Wu, Jingbang
Yu, Ye
Guo, Yihan
Zhou, Shufen
2022 IEEE 10TH INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATION AND NETWORKS (ICICN 2022), 2022, : 109 - 113
[44] DRL-OR: Deep Reinforcement Learning-based Online Routing for Multi-type Service Requirements
Liu, Chenyi
Xu, Mingwei
Yang, Yuan
Geng, Nan
IEEE CONFERENCE ON COMPUTER COMMUNICATIONS (IEEE INFOCOM 2021), 2021,
[45] DRL-Tomo: a deep reinforcement learning-based approach to augmented data generation for network tomography
Hou, Changsheng
Hou, Bingnan
Li, Xionglve
Zhou, Tongqing
Chen, Yingwen
Cai, Zhiping
COMPUTER JOURNAL, 2024, 67 (10): : 2995 - 3008
[46] An analysis of a router-based loss detection service for active reliable multicast protocols
Maimour, M
Pham, CD
10TH IEEE INTERNATIONAL CONFERENCE ON NETWORKS (ICON 2002), PROCEEDINGS, 2002, : 49 - 56
[47] Accurate Machine-Learning-Based On-Chip Router Modeling
Jeong, Kwangok
Kahng, Andrew B.
Lin, Bill
Samadi, Kambiz
IEEE EMBEDDED SYSTEMS LETTERS, 2010, 2 (03) : 62 - 66
[48] Real-time local path planning strategy based on deep distributional reinforcement learning
Du, Shengli
Zhu, Zexing
Wang, Xuefang
Han, Honggui
Qiao, Junfei
NEUROCOMPUTING, 2024, 599
[49] Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks
Sohn, Sungryull
Lee, Sungtae
Choi, Jongwook
van Seijen, Harm
Fatemi, Mehdi
Lee, Honglak
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
[50] Significant Sampling for Shortest Path Routing: A Deep Reinforcement Learning Solution
Shao, Yulin
Rezaee, Arman
Liew, Soung Chang
Chan, Vincent W. S.
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2020, 38 (10) : 2234 - 2248
