Hyper-heuristic for CVRP with reinforcement learning

Cited by: 0
Authors
Zhang J. [1]
Feng Q. [1]
Zhao Y. [1]
Liu J. [1]
Leng L. [1]
Affiliation
[1] Key Laboratory of Special Equipment Manufacturing and Advanced Processing Technology, Ministry of Education, Zhejiang University of Technology, Hangzhou
Keywords
Deep Q neural network; Hyper-heuristic algorithm; Reinforcement learning; Vehicle routing problem
DOI
10.13196/j.cims.2020.04.025
Abstract
To reduce the tendency to become trapped in local optima when solving the capacitated vehicle routing problem (CVRP), a hyper-heuristic algorithm based on reinforcement learning was proposed. A high-level heuristic strategy was designed, comprising a selection strategy and an acceptance criterion. Building on the learning mechanism, the deep Q-network algorithm from reinforcement learning was used to construct the selection strategy and to evaluate the performance of the low-level operators through rewards and punishments. Rewards and punishments, together with simulated annealing, were used as the acceptance criterion, and a sequence pool was constructed for high-quality solutions to guide the search effectively. In addition, a clustering method was used to improve the quality of the initial solution. The best values, error rates, and average values were compared with those of other algorithms. The experimental results show that the proposed algorithm is effective and stable in solving the problem, and that its overall solution quality is better than that of the comparison algorithms. © 2020, Editorial Department of CIMS. All rights reserved.
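The loop described in the abstract (a learned selection strategy choosing among low-level operators, scored by rewards and punishments, with simulated annealing as the acceptance criterion) can be illustrated with a toy sketch. This is not the authors' implementation. As a deliberate simplification, the deep Q-network is replaced by an epsilon-greedy table of running-average operator scores, and the sequence pool and clustering-based initial solution are omitted. All function names, parameters, and default values below are hypothetical.

```python
import math
import random


def route_cost(perm, coords, demand, capacity):
    """Total distance of a customer permutation, split greedily into
    capacity-feasible routes that start and end at a depot at (0, 0)."""
    depot = (0.0, 0.0)
    total, load, prev = 0.0, 0, depot
    for c in perm:
        if load + demand[c] > capacity:  # vehicle full: return to depot
            total += math.dist(prev, depot)
            prev, load = depot, 0
        total += math.dist(prev, coords[c])
        prev, load = coords[c], load + demand[c]
    return total + math.dist(prev, depot)


# Low-level operators (illustrative choices, not taken from the paper).
def swap(perm, rng):
    i, j = rng.sample(range(len(perm)), 2)
    new = perm[:]
    new[i], new[j] = new[j], new[i]
    return new


def relocate(perm, rng):
    new = perm[:]
    c = new.pop(rng.randrange(len(new)))
    new.insert(rng.randrange(len(new) + 1), c)
    return new


def reverse_segment(perm, rng):
    i, j = sorted(rng.sample(range(len(perm)), 2))
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


def hyper_heuristic(coords, demand, capacity, iters=2000, seed=0,
                    epsilon=0.1, alpha=0.1, temp=10.0, cooling=0.995):
    rng = random.Random(seed)
    operators = [swap, relocate, reverse_segment]
    q = [0.0] * len(operators)  # score table standing in for the DQN
    current = list(range(len(coords)))
    rng.shuffle(current)
    cur_cost = route_cost(current, coords, demand, capacity)
    best, best_cost = current[:], cur_cost
    for _ in range(iters):
        # Selection strategy: epsilon-greedy over operator scores.
        if rng.random() < epsilon:
            k = rng.randrange(len(operators))
        else:
            k = max(range(len(operators)), key=q.__getitem__)
        cand = operators[k](current, rng)
        cand_cost = route_cost(cand, coords, demand, capacity)
        delta = cand_cost - cur_cost
        # Reward an improving move, punish any other move.
        reward = 1.0 if delta < 0 else -1.0
        q[k] += alpha * (reward - q[k])
        # Acceptance criterion: simulated annealing.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_cost = cand, cand_cost
            if cur_cost < best_cost:
                best, best_cost = current[:], cur_cost
        temp = max(temp * cooling, 1e-9)
    return best, best_cost, q


if __name__ == "__main__":
    rng = random.Random(1)
    coords = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(12)]
    demand = [rng.randint(1, 5) for _ in range(12)]
    best, cost, scores = hyper_heuristic(coords, demand, capacity=15)
    print(best, round(cost, 2), scores)
```

The sketch assumes every customer's demand fits within a single vehicle's capacity. A state-dependent selector such as the deep Q-network named in the abstract would replace the `q` table with a learned function of the search state.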
Pages: 1118-1129
Page count: 11