Combining deep reinforcement learning with heuristics to solve the traveling salesman problem

Authors
Hong, Li [1]
Liu, Yu [1]
Xu, Mengqiao [2]
Deng, Wenhui [2]
Affiliations
[1] Dalian Univ Technol, Sch Software Technol, Dalian 116620, Peoples R China
[2] Dalian Univ Technol, Sch Econ & Management, Dalian 116024, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
traveling salesman problem; deep reinforcement learning; simulated annealing algorithm; transformer model; whale optimization algorithm; 87.55.kd; 87.55.de; 07.05.Mh; ALGORITHM;
DOI
10.1088/1674-1056/ad95f1
Chinese Library Classification (CLC) number
O4 [Physics];
Discipline classification code
0702 ;
Abstract
Recent studies employing deep learning to solve the traveling salesman problem (TSP) have mainly focused on learning construction heuristics. Such methods can improve TSP solutions, but still depend on additional programs. Methods that instead learn improvement heuristics to iteratively refine solutions remain underexplored. Traditional improvement heuristics are guided by a manually designed search strategy and may achieve only limited improvements. This paper proposes a novel framework for learning improvement heuristics, which automatically discovers better improvement policies for heuristics that iteratively solve the TSP. The framework first designs a new transformer-based architecture to parameterize the policy network, and introduces an action-dropout layer to prevent action selection from overfitting. It then proposes a deep reinforcement learning approach integrating a simulated annealing mechanism (named RL-SA) to learn the pairwise selection policy, aiming to improve the performance of the 2-opt algorithm. RL-SA leverages the whale optimization algorithm to generate initial solutions for better sampling efficiency and uses a Gaussian perturbation strategy to tackle the sparse reward problem of reinforcement learning. The experimental results show that the proposed approach is significantly superior to state-of-the-art learning-based methods, and further reduces the gap between learning-based methods and highly optimized solvers on the benchmark datasets. Moreover, the pre-trained model M can be applied to guide the SA algorithm (named M-SA (ours)), which performs better than existing deep models on small-, medium-, and large-scale TSPLIB datasets. Additionally, M-SA (ours) achieves excellent generalization performance on a real-world dataset of global liner shipping routes, with distance reductions ranging from 3.52% to 17.99%.
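The abstract's core idea, a 2-opt improvement heuristic whose moves are accepted through a simulated annealing criterion, can be illustrated with a minimal sketch. This is not the authors' RL-SA: the node pair is sampled uniformly at random where the paper uses a learned policy, the initial tour is a random shuffle where the paper uses the whale optimization algorithm, and the function names and the `steps`, `t0`, and `cooling` parameters are assumptions made for illustration.

```python
import math
import random


def tour_length(tour, pts):
    """Total Euclidean length of the closed tour over the points in pts."""
    n = len(tour)
    return sum(math.dist(pts[tour[k]], pts[tour[(k + 1) % n]]) for k in range(n))


def two_opt_move(tour, i, j):
    """Return a new tour with the segment tour[i..j] reversed (a 2-opt move)."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def sa_two_opt(pts, steps=20000, t0=1.0, cooling=0.9995, seed=0):
    """Simulated annealing over 2-opt moves with uniformly random node pairs.

    Assumption: random pair selection stands in for the learned pairwise
    selection policy described in the abstract.
    """
    rng = random.Random(seed)
    n = len(pts)
    tour = list(range(n))
    rng.shuffle(tour)  # assumption: random start, not whale optimization
    cur_len = tour_length(tour, pts)
    best, best_len = tour[:], cur_len
    temp = t0
    for _ in range(steps):
        i, j = sorted(rng.sample(range(n), 2))
        cand = two_opt_move(tour, i, j)
        cand_len = tour_length(cand, pts)
        delta = cand_len - cur_len
        # Metropolis criterion: always accept improvements, and accept
        # worse tours with probability exp(-delta / temp).
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            tour, cur_len = cand, cand_len
            if cur_len < best_len:
                best, best_len = tour[:], cur_len
        temp *= cooling  # geometric cooling schedule
    return best, best_len
```

Calling `sa_two_opt(points)` on a list of `(x, y)` tuples returns the best tour found and its length. In a learned variant, the line that samples `i, j` is the place where a policy network would propose the pair.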
Pages: 11