Combining deep reinforcement learning with heuristics to solve the traveling salesman problem

Authors
洪莉 [1]
刘宇 [1]
徐梦俏 [2]
邓文慧 [2]
Affiliations
[1] School of Software Technology, Dalian University of Technology
[2] School of Economics and Management, Dalian University of
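Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract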
Recent studies employing deep learning to solve the traveling salesman problem (TSP) have mainly focused on learning construction heuristics. Such methods can improve TSP solutions but still depend on additional programs. However, methods that focus on learning improvement heuristics to iteratively refine solutions remain insufficient. Traditional improvement heuristics are guided by a manually designed search strategy and may achieve only limited improvements. This paper proposes a novel framework for learning improvement heuristics, which automatically discovers better improvement policies for heuristics that iteratively solve the TSP. The framework first designs a new transformer-based architecture to parameterize the policy network, introducing an action-dropout layer to prevent action selection from overfitting. It then proposes a deep reinforcement learning approach integrating a simulated annealing mechanism (named RL-SA) to learn the pairwise selection policy, aiming to improve the performance of the 2-opt algorithm. RL-SA leverages the whale optimization algorithm to generate initial solutions for better sampling efficiency and uses a Gaussian perturbation strategy to tackle the sparse reward problem of reinforcement learning. The experimental results show that the proposed approach is significantly superior to state-of-the-art learning-based methods and further reduces the gap between learning-based methods and highly optimized solvers on the benchmark datasets. Moreover, the pre-trained model M can be applied to guide the SA algorithm (named M-SA (ours)), which performs better than existing deep models on small-, medium-, and large-scale TSPLIB datasets. Additionally, M-SA (ours) achieves excellent generalization performance on a real-world dataset of global liner shipping routes, with distance reductions ranging from 3.52% to 17.99%.
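The abstract combines two classical ingredients, 2-opt moves and simulated annealing (SA) acceptance. The sketch below illustrates only that baseline combination and is not the authors' RL-SA. It makes several assumptions: the pair of positions to reverse is drawn uniformly at random rather than chosen by a learned policy, the starting tour is a random permutation rather than one produced by the whale optimization algorithm, and the geometric cooling schedule is picked for illustration. The names `tour_length`, `two_opt_move`, and `sa_two_opt` are hypothetical.

```python
import math
import random


def tour_length(points, tour):
    """Total Euclidean length of the closed tour."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def two_opt_move(tour, i, j):
    """Return a new tour with the segment tour[i..j] reversed (one 2-opt move)."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def sa_two_opt(points, steps=20000, t_start=1.0, t_end=1e-3, seed=0):
    """Simulated annealing over 2-opt moves with uniformly random index pairs.

    Assumption: a learned policy, as described in the abstract, would replace
    the uniform sampling of (i, j) below.
    """
    rng = random.Random(seed)
    n = len(points)
    tour = list(range(n))
    rng.shuffle(tour)  # random start (assumption; not whale optimization)
    cur_len = tour_length(points, tour)
    best, best_len = tour[:], cur_len
    for step in range(steps):
        # Geometric cooling from t_start to t_end (an assumption of this sketch).
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i, j = sorted(rng.sample(range(n), 2))
        cand = two_opt_move(tour, i, j)
        cand_len = tour_length(points, cand)
        delta = cand_len - cur_len
        # Metropolis rule: always accept improvements, sometimes accept worse tours.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            tour, cur_len = cand, cand_len
            if cur_len < best_len:
                best, best_len = tour[:], cur_len
    return best, best_len
```

Calling `sa_two_opt(points)` on a list of `(x, y)` tuples returns the best tour found and its length. Accepting occasional worsening moves at high temperature is what lets the search escape the local optima where plain 2-opt stops.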
Pages: 100-110
Number of pages: 11