Bin Packing Optimization via Deep Reinforcement Learning

Cited by: 0

Authors:

Wang, Baoying [1]

Lin, Zhaohui [1]

Kong, Weijie [1]

Dong, Huixu [1,2]

Affiliations:

[1] Zhejiang Univ, Mech Engn Dept, Grasp Lab, Hangzhou 310027, Peoples R China

[2] Zhejiang Key Lab Ind Big Data & Robot Intelligent, Hangzhou 310058, Peoples R China

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2025, Vol. 10, No. 3

Funding:

National Natural Science Foundation of China

Keywords:

Three-dimensional displays; Robots; Genetic algorithms; Costs; Deep reinforcement learning; Decoding; Accuracy; Search problems; Logistics; Convolution; Reinforcement learning; manipulation planning; bin packing problem (BPP); robot packing; ROBOTIC MANIPULATION; ALGORITHM;

DOI:

10.1109/LRA.2025.3534070

Chinese Library Classification:

TP24 [Robotics]

Discipline classification codes:

080202; 1405

Abstract:

The bin packing problem (BPP) has recently attracted considerable research interest owing to its widespread applications in logistics and warehousing. Optimizing bin packing so that more objects fit into each bin is essential, and the object packing order and placement position are the two crucial optimization targets. However, existing optimization methods for BPP, such as the genetic algorithm (GA), suffer from high time cost and relatively low accuracy, which makes them difficult to deploy in realistic scenarios. To address these gaps, we present a novel deep reinforcement learning (DRL) method for optimizing 2D and 3D BPP with regularly shaped objects, maximizing space utilization and minimizing the number of bins used. First, an end-to-end DRL neural network built on a modified Pointer Network, consisting of an encoder, a decoder, and an attention module, is proposed to obtain the optimal object packing order. Second, conforming to the top-down operation mode, a placement strategy based on a height map determines the placement positions of the ordered objects in the bins, preventing the objects from colliding with the bins and with other objects already in them. Third, the reward and loss functions are defined in terms of compactness, pyramid, and the number of bins used, and the DRL neural network is trained within an on-policy actor-critic framework. Finally, we conduct extensive experiments to evaluate the proposed method and demonstrate that it achieves a 3% improvement and more than 50x time saving over the GA. Further, a robotic packing experiment is carried out to validate its generalization capacity in a realistic environment.
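
The height-map placement step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the integer grid, the "lowest resting height, first found" tie-breaking rule, the absence of rotations, and the names `find_placement`, `place`, and `pack_in_order` are all assumptions made for illustration. The packing order is taken as given, standing in for the order that the Pointer Network would produce.

```python
import numpy as np

def find_placement(height_map, length, width):
    """Scan all footprints and return (x, y, z) with the lowest resting height z.

    Returns None when the footprint does not fit inside the bin's base.
    The "lowest z, first found" rule is an illustrative assumption.
    """
    bin_l, bin_w = height_map.shape
    best = None
    for x in range(bin_l - length + 1):
        for y in range(bin_w - width + 1):
            # Lowered from above, the box rests on the tallest cell under it,
            # so it cannot intersect objects that are already in the bin.
            z = int(height_map[x:x + length, y:y + width].max())
            if best is None or z < best[2]:
                best = (x, y, z)
    return best

def place(height_map, box, bin_height):
    """Place one box top-down and update the height map in place."""
    length, width, height = box
    pos = find_placement(height_map, length, width)
    # If even the lowest resting position overflows the bin, nothing fits.
    if pos is None or pos[2] + height > bin_height:
        return None
    x, y, z = pos
    height_map[x:x + length, y:y + width] = z + height
    return pos

def pack_in_order(boxes, bin_size):
    """Pack boxes in the given order; return placements and space utilization."""
    bin_l, bin_w, bin_h = bin_size
    height_map = np.zeros((bin_l, bin_w), dtype=int)
    placements, packed_volume = [], 0
    for box in boxes:
        pos = place(height_map, box, bin_h)
        placements.append(pos)
        if pos is not None:
            packed_volume += box[0] * box[1] * box[2]
    return placements, packed_volume / (bin_l * bin_w * bin_h)

# Hypothetical example: four boxes (length, width, height) in a 4 x 4 x 4 bin.
placements, utilization = pack_in_order(
    [(2, 2, 2), (2, 2, 1), (4, 4, 1), (4, 4, 2)], (4, 4, 4)
)
```

In this sketch a box that cannot be placed is simply skipped, whereas the method in the abstract counts the number of bins used, which would correspond to opening a new bin at that point.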

Pages: 2542-2549

Page count: 8