Bin Packing Optimization via Deep Reinforcement Learning

Times cited: 0
Authors
Wang, Baoying [1]
Lin, Zhaohui [1]
Kong, Weijie [1]
Dong, Huixu [1, 2]
Affiliations
[1] Zhejiang Univ, Mech Engn Dept, Grasp Lab, Hangzhou 310027, Peoples R China
[2] Zhejiang Key Lab Ind Big Data & Robot Intelligent, Hangzhou 310058, Peoples R China
Source
IEEE ROBOTICS AND AUTOMATION LETTERS | 2025 / Vol. 10 / No. 03
Funding
National Natural Science Foundation of China;
Keywords
Three-dimensional displays; Robots; Genetic algorithms; Costs; Deep reinforcement learning; Decoding; Accuracy; Search problems; Logistics; Convolution; Reinforcement learning; manipulation planning; bin packing problem (BPP); robot packing; ROBOTIC MANIPULATION; ALGORITHM;
DOI
10.1109/LRA.2025.3534070
Chinese Library Classification number
TP24 [Robotics];
Discipline classification code
080202; 1405;
Abstract
The bin packing problem (BPP) has attracted considerable research interest recently owing to its widespread applications in logistics and warehousing. Optimizing bin packing so that more objects fit into each bin is essential, and the object packing order and the placement position are the two crucial optimization targets. However, existing optimization methods for BPP, such as the genetic algorithm (GA), suffer from high time cost and relatively low accuracy, which makes them difficult to apply in realistic scenarios. To address these gaps, we present a novel optimization method for 2D and 3D BPP with regularly shaped objects via deep reinforcement learning (DRL), which maximizes space utilization and minimizes the number of bins used. First, an end-to-end DRL neural network built on a modified Pointer Network, consisting of an encoder, a decoder, and an attention module, is proposed to obtain the optimal object packing order. Second, in keeping with the top-down operation mode, a placement strategy based on a height map determines the placement positions of the ordered objects in the bins and prevents the objects from colliding with the bins and with objects already placed. Third, the reward and loss functions are defined in terms of the compactness, the pyramid, and the number of bins used, and the DRL neural network is trained within an on-policy actor-critic framework. Finally, we conduct extensive experiments to evaluate the proposed method and demonstrate that it achieves a 3% improvement and more than 50x time saving over the GA. Further, a robotic packing experiment is carried out to validate its generalization capacity in a realistic environment.
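The height-map placement step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the placement rule, so the code below assumes a simple "lowest resting height, first found" rule on an integer grid, with no rotation. The names `find_placement`, `place`, and `pack` are hypothetical. The packing order is taken as given (in the paper it would come from the learned policy).

```python
import numpy as np

def find_placement(height_map, length, width):
    """Return (x, y, z) of the lowest resting position for a footprint, or None.

    Assumed rule (not from the paper): scan all grid offsets and keep the
    first one with the smallest resting height.
    """
    bin_l, bin_w = height_map.shape
    best = None
    for x in range(bin_l - length + 1):
        for y in range(bin_w - width + 1):
            # A box lowered from above rests on the tallest cell under it,
            # so it cannot intersect anything already placed.
            z = int(height_map[x:x + length, y:y + width].max())
            if best is None or z < best[2]:
                best = (x, y, z)
    return best

def place(height_map, box, bin_height):
    """Place one box top-down; update the height map; return its position or None."""
    length, width, height = box
    pos = find_placement(height_map, length, width)
    # The lowest candidate is the only one worth checking against the bin top.
    if pos is None or pos[2] + height > bin_height:
        return None
    x, y, z = pos
    height_map[x:x + length, y:y + width] = z + height
    return pos

def pack(boxes, bin_size):
    """Pack boxes in the given order; return placements and space utilization."""
    bin_l, bin_w, bin_h = bin_size
    height_map = np.zeros((bin_l, bin_w), dtype=int)
    placements = [place(height_map, box, bin_h) for box in boxes]
    packed = sum(l * w * h for (l, w, h), p in zip(boxes, placements) if p is not None)
    return placements, packed / (bin_l * bin_w * bin_h)

if __name__ == "__main__":
    placements, utilization = pack([(2, 2, 2), (2, 2, 2), (2, 4, 2)], (4, 4, 2))
    print(placements, utilization)
```

Because every box is lowered onto the maximum height under its footprint and offsets are restricted to the bin footprint, collisions with the bin walls and with earlier boxes are ruled out by construction; only the bin's top needs an explicit check.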
Pages: 2542-2549
Number of pages: 8
Related papers
50 in total
  • [1] Learning to Solve 3-D Bin Packing Problem via Deep Reinforcement Learning and Constraint Programming
    Jiang, Yuan
    Cao, Zhiguang
    Zhang, Jie
    IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (05) : 2864 - 2875
  • [2] Online 3D Bin Packing with Constrained Deep Reinforcement Learning
    Zhao, Hang
    She, Qijin
    Zhu, Chenyang
    Yang, Yin
    Xu, Kai
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 741 - 749
  • [3] Integrating Heuristic Methods with Deep Reinforcement Learning for Online 3D Bin-Packing Optimization
    Wong, Ching-Chang
    Tsai, Tai-Ting
    Ou, Can-Kun
    SENSORS, 2024, 24 (16)
  • [4] Heuristics Integrated Deep Reinforcement Learning for Online 3D Bin Packing
    Yang, Shuo
    Song, Shuai
    Chu, Shilei
    Song, Ran
    Cheng, Jiyu
    Li, Yibin
    Zhang, Wei
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2024, 21 (01) : 939 - 950
  • [5] GOPT: Generalizable Online 3D Bin Packing via Transformer-Based Deep Reinforcement Learning
    Xiong, Heng
    Guo, Changrong
    Peng, Jian
    Ding, Kai
    Chen, Wenjie
    Qiu, Xuchong
    Bai, Long
    Xu, Jianfeng
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (11) : 10335 - 10342
  • [6] A deep reinforcement learning approach for online and concurrent 3D bin packing optimisation with bin replacement strategies
    Tsang, Y. P.
    Mo, D. Y.
    Chung, K. T.
    Lee, C. K. M.
    COMPUTERS IN INDUSTRY, 2025, 164
  • [7] Optimization of Molecules via Deep Reinforcement Learning
    Zhenpeng Zhou
    Steven Kearnes
    Li Li
    Richard N. Zare
    Patrick Riley
    Scientific Reports, 9
  • [8] Optimization of Molecules via Deep Reinforcement Learning
    Zhou, Zhenpeng
    Kearnes, Steven
    Li, Li
    Zare, Richard N.
    Riley, Patrick
    SCIENTIFIC REPORTS, 2019, 9 (1)
  • [9] Learning to multi-vehicle cooperative bin packing problem via sequence-to-sequence policy network with deep reinforcement learning model
    Tian, Ran
    Kang, Chunming
    Bi, Jiaming
    Ma, Zhongyu
    Liu, Yanxing
    Yang, Saisai
    Li, Fangfang
    COMPUTERS & INDUSTRIAL ENGINEERING, 2023, 177
  • [10] Capacity planning in logistics corridors: Deep reinforcement learning for the dynamic stochastic temporal bin packing problem
    Farahani, Amirreza
    Genga, Laura
    Schrotenboer, Albert H.
    Dijkman, Remco
    TRANSPORTATION RESEARCH PART E-LOGISTICS AND TRANSPORTATION REVIEW, 2024, 191