DRL Router: Distributional Reinforcement Learning-Based Router for Reliable Shortest Path Problems

Cited by: 4
Authors
Guo, Hongliang [1]
Sheng, Wenda [2]
Gao, Chen [3]
Jin, Yaochu [4]
Affiliations
[1] Sichuan Univ SCU, Coll Comp Sci, Chengdu 610065, Sichuan, Peoples R China
[2] Univ Elect Sci & Technol China, Chengdu 611731, Peoples R China
[3] Swiss Fed Inst Technol, ZH-8092 Zurich, Switzerland
[4] Bielefeld Univ, D-33619 Bielefeld, Germany
Keywords
Transportation; Reliability; Planning; Decision making; Routing; Navigation; Bibliographies; TRAVEL-TIME; STOCHASTIC NETWORKS; ALGORITHM; PROBABILITY
DOI
10.1109/MITS.2023.3265309
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification codes
0808; 0809
Abstract
This article studies reliable shortest path (RSP) problems in stochastic transportation networks. The term reliability in the RSP literature has many definitions, e.g., 1) maximal stochastic on-time arrival probability, 2) minimal travel time with a high-confidence constraint, 3) minimal mean and standard deviation combination, and 4) minimal expected disutility. To the best of our knowledge, almost all state-of-the-art RSP solutions are designed to target one specific RSP objective, and it is very difficult, if not impossible, to adapt them to other RSP objectives. To bridge the gap, this article develops a distributional reinforcement learning (DRL)-based algorithm, namely, DRL-Router, which serves as a universal solution to the four aforementioned RSP problems. DRL-Router employs the DRL method to approximate the full travel time distribution of a given routing policy and then makes improvements with respect to the user-defined RSP objective through a generalized policy iteration scheme. DRL-Router is 1) universal, i.e., it is applicable to a variety of RSP objectives; 2) model free, i.e., it does not rely on well-calibrated travel time distribution models; 3) adaptive, i.e., it adjusts to changes in the navigation objective; and 4) fast, i.e., it performs real-time decision making. Extensive experimental results and comparisons with baseline algorithms in various transportation networks demonstrate both the accuracy and efficiency of DRL-Router.
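The four reliability definitions listed in the abstract can all be read off a single travel time distribution, which is one way to see why approximating the full distribution supports every objective. The following is only a minimal sketch of that idea, not the DRL-Router algorithm itself: it assumes each route's distribution is represented by a handful of sampled travel times, and every function name, sample value, deadline, confidence level, weight, and disutility function is hypothetical and chosen purely for illustration.

```python
import math
import statistics

# Illustrative only: score a route under each of the four RSP objectives
# from sampled travel times. All names and numbers are assumptions.

def on_time_probability(samples, deadline):
    """Objective 1: fraction of sampled travel times within the deadline."""
    return sum(t <= deadline for t in samples) / len(samples)

def travel_time_at_confidence(samples, confidence):
    """Objective 2: smallest sampled time t with empirical P(T <= t) >= confidence."""
    ordered = sorted(samples)
    index = max(0, math.ceil(confidence * len(ordered)) - 1)
    return ordered[index]

def mean_std(samples, weight):
    """Objective 3: mean plus weight times the population standard deviation."""
    return statistics.fmean(samples) + weight * statistics.pstdev(samples)

def expected_disutility(samples, disutility):
    """Objective 4: average of a user-supplied disutility over the samples."""
    return statistics.fmean([disutility(t) for t in samples])

# Hypothetical samples (minutes): route_a is faster on average but riskier.
route_a = [10, 10, 11, 25]
route_b = [14, 15, 15, 16]

def lateness_penalty(t):
    """Hypothetical disutility: squared lateness beyond 12 minutes."""
    return max(0, t - 12) ** 2

for name, samples in (("route_a", route_a), ("route_b", route_b)):
    print(
        name,
        on_time_probability(samples, deadline=16),
        travel_time_at_confidence(samples, confidence=0.9),
        round(mean_std(samples, weight=1.0), 3),
        expected_disutility(samples, lateness_penalty),
    )
```

In this toy data, route_a has the lower mean travel time, yet route_b scores better under all four reliability objectives, which illustrates why the choice of objective, rather than the mean alone, drives the routing decision.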
Pages: 91-108
Page count: 18
Related papers
50 in total
  • [1] An Enhanced Deep Reinforcement Learning-Based Global Router for VLSI Design
    Xu S.
    Yang L.
    Liu G.
    Wireless Communications and Mobile Computing, 2023, 2023
  • [2] Reinforcement Learning-based Auto-router considering Signal Integrity
    Kim, Minsu
    Park, Hyunwook
    Kim, Seongguk
    Son, Keeyoung
    Kim, Subin
    Son, Kyunjune
    Choi, Seonguk
    Park, Gapyeol
    Kim, Joungho
    2020 IEEE 29TH CONFERENCE ON ELECTRICAL PERFORMANCE OF ELECTRONIC PACKAGING AND SYSTEMS (EPEPS 2020), 2020
  • [3] Classical and contemporary shortest path problems in road networks: Implementation and experimental analysis of the TRANSIMS router
    Barrett, C
    Bisset, K
    Jacob, R
    Konjevod, G
    Marathe, M
    ALGORITHMS-ESA 2002, PROCEEDINGS, 2002, 2461 : 126 - 138
  • [4] BPA-A parallel shortest path algorithm for cluster-router
    Zhang, Xiaoping
    Wu, Jianping
    Zhang, Ning
    Zhao, Youjian
    PROCEEDINGS OF THE 18TH IASTED INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING AND SYSTEMS, 2006, : 283 - +
  • [5] OVERCOMING VALUE OVERESTIMATION FOR DISTRIBUTIONAL REINFORCEMENT LEARNING-BASED PATH PLANNING WITH CONSERVATIVE CONSTRAINTS
    Gu, Yuwan
    Chu, Yongtao
    Meng, Fang
    Chen, Yan
    Lv, Jidong
    Xu, Shoukun
    INTERNATIONAL JOURNAL OF ROBOTICS & AUTOMATION, 2025, 40 (02): : 124 - 132
  • [6] Model-based reinforcement learning for router port queue configurations
    Kattepur A.
    David S.
    Mohalik S.K.
    Intelligent and Converged Networks, 2021, 2 (03): : 177 - 197
  • [7] GE-DDRL: Graph Embedding and Deep Distributional Reinforcement Learning for Reliable Shortest Path: A Universal and Scale Free Solution
    Guo, Hongliang
    Sheng, Wenda
    Zhou, Yingjie
    Chen, Yunping
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (11) : 12196 - 12214
  • [8] A scalable multistage packet switch for terabit IP router based on deflection routing and shortest path routing
    Morino, H
    Bao, TT
    Hoaison, N
    Aida, H
    Saito, T
    2002 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, VOLS 1-5, CONFERENCE PROCEEDINGS, 2002, : 2179 - 2185
  • [9] Reliable router based on signal transition time adjustment
    Zhang, Ying
    Jiang, Jianhui
    Li, Huawei
    Li, Xiaowei
    Tongji Daxue Xuebao/Journal of Tongji University, 2015, 43 (02): : 305 - 311
  • [10] Deep Reinforcement Learning for Router Selection in Network With Heavy Traffic
    Ding, Ruijin
    Xu, Yadong
    Gao, Feifei
    Shen, Xuemin
    Wu, Wen
    IEEE ACCESS, 2019, 7 : 37109 - 37120