Reinforcement Learning Heuristic Algorithm for Solving the Two-dimensional Strip Packing Problem

Authors
Yang M.-G. [1]
Chen M.-F. [1]
Yang S.-Y. [1]
Zhang D.-F. [1]
Affiliation
[1] School of Informatics, Xiamen University, Xiamen
Source
Ruan Jian Xue Bao/Journal of Software | 2021, Vol. 32, No. 12
Keywords
Heuristics; Hierarchical search; Pointer network; Reinforcement learning; Two-dimensional strip packing problem
DOI
10.13328/j.cnki.jos.006161
Abstract
The two-dimensional strip packing problem is a classic NP-hard combinatorial optimization problem with wide applications in daily life and industrial production. This study proposes a reinforcement learning heuristic algorithm for it. Reinforcement learning supplies an initial packing sequence to the heuristic algorithm, which alleviates the heuristic's cold-start problem. The reinforcement learning model learns in a self-driven manner: it uses only the value of the solution computed by the heuristic as the reward signal to optimize the network, so that the network learns to output better packing sequences. A simplified version of the pointer network decodes the output packing sequence; the model consists of an embedding layer, a decoder, and an attention mechanism. The model is trained with the actor-critic algorithm, which improves its efficiency. The reinforcement learning heuristic algorithm is tested on 714 standard problem instances and 400 generated problem instances. Experimental results show that the proposed algorithm effectively alleviates the heuristic cold-start problem and outperforms state-of-the-art heuristics, achieving higher solution quality. © Copyright 2021, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
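The reward signal described in the abstract can be illustrated with a minimal sketch. It assumes integer rectangle sizes, no rotation, and a simple lowest-then-leftmost skyline placement rule; this rule is a stand-in chosen for brevity, not the heuristic used in the paper, and the names `pack_height` and `reward` are hypothetical. The point is only that a packing sequence goes in, a strip height comes out, and the negated height can serve as the reward for whatever policy proposed the sequence.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[int, int]  # (width, height); integer sizes are an assumption


def pack_height(rects: Sequence[Rect], order: Sequence[int], strip_width: int) -> int:
    """Place rectangles in the given order and return the resulting strip height.

    Placement rule (illustrative assumption, not the paper's heuristic):
    each rectangle rests at the lowest feasible position, ties go leftmost.
    """
    skyline: List[int] = [0] * strip_width  # current top of each unit-wide column
    for idx in order:
        w, h = rects[idx]
        if w > strip_width:
            raise ValueError("rectangle wider than the strip")
        # Scan every horizontal offset and keep the lowest resting height.
        best_x, best_y = 0, max(skyline[0:w])
        for x in range(1, strip_width - w + 1):
            y = max(skyline[x:x + w])
            if y < best_y:  # strict comparison keeps the leftmost tie
                best_x, best_y = x, y
        # Raise the skyline under the placed rectangle.
        for col in range(best_x, best_x + w):
            skyline[col] = best_y + h
    return max(skyline)


def reward(rects: Sequence[Rect], order: Sequence[int], strip_width: int) -> float:
    """Negated strip height: a shorter packing yields a larger reward."""
    return -float(pack_height(rects, order, strip_width))
```

For the rectangles `[(3, 1), (2, 2), (2, 2), (1, 1)]` in a strip of width 4, the order `[0, 1, 2, 3]` gives height 4 under this rule, while `[1, 2, 0, 3]` gives height 3, which matches the area lower bound. This dependence on the order is what a sequence-generating model can exploit.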
Pages: 3684-3697
Page count: 13