Learning to multi-vehicle cooperative bin packing problem via sequence-to-sequence policy network with deep reinforcement learning model

Cited by: 7
Authors
Tian, Ran [1]
Kang, Chunming [1]
Bi, Jiaming [1]
Ma, Zhongyu [1]
Liu, Yanxing [1]
Yang, Saisai [1]
Li, Fangfang [1]
Affiliations
[1] Northwest Normal Univ, Dept Coll Comp Sci & Engn, Lanzhou 730070, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep Reinforcement Learning; 3D Bin Packing Policy; Position Sequence; Logistics Packing; SEARCH ALGORITHM; LOCAL SEARCH; SUPPLY CHAIN; OPTIMIZATION;
DOI
10.1016/j.cie.2023.108998
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In logistics bin packing scenarios where vehicles have only rear doors, the packing sequence of items determines how well the vehicle's packing space is utilized, yet relatively little research has addressed optimizing that sequence. This article therefore focuses on the packing sequence problem within the multi-vehicle cooperative bin packing problem (MVCBPP) and proposes S2SDRL, a deep reinforcement learning model based on a sequence-to-sequence policy network. First, a sequence-to-sequence neural network is constructed to determine the packing probability of every item; its encoder and decoder combine a bidirectional LSTM with an attention module. Second, the packing strategy for the items is obtained through the constructed reinforcement learning packing framework. Finally, the Seq2Seq policy network is updated and optimized by the policy gradient method with a baseline to obtain the current optimal packing strategy. In several bin packing scenarios, S2SDRL improves the average vehicle space utilization by more than 4.0% compared with the traditional packing algorithm, and the model's forward computation time is much shorter than that of the traditional heuristic algorithm, which makes the model more practical for real applications. Ablation experiments confirm that the modules in S2SDRL are effective for optimizing the packing order, and the sensitivity analysis shows that the model remains reasonably stable when the input data changes.
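The policy gradient update with a baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' S2SDRL model: the bidirectional LSTM encoder and attention decoder are replaced here by one learnable logit per item, the reward is a one-dimensional toy stand-in for space utilization, and every name (`utilization`, `sample_order`, `train`) and hyperparameter is hypothetical.

```python
import math
import random


def utilization(order, sizes, capacity):
    """Toy reward (assumption): load items in the given order and stop at
    the first item that no longer fits, as with a single rear door."""
    used = 0.0
    for i in order:
        if used + sizes[i] > capacity:
            break
        used += sizes[i]
    return used / capacity


def sample_order(logits, rng):
    """Sample a packing order without replacement from a softmax over the
    remaining items; also return the gradient of the order's log-probability
    with respect to the logits."""
    remaining = list(range(len(logits)))
    order, grad = [], [0.0] * len(logits)
    while remaining:
        top = max(logits[j] for j in remaining)  # subtract max for stability
        weights = [math.exp(logits[j] - top) for j in remaining]
        total = sum(weights)
        probs = [w / total for w in weights]
        pick = rng.choices(remaining, weights=probs, k=1)[0]
        # d/d(logit_j) of log softmax: indicator(j == pick) - prob_j
        for j, p in zip(remaining, probs):
            grad[j] -= p
        grad[pick] += 1.0
        order.append(pick)
        remaining.remove(pick)
    return order, grad


def train(sizes, capacity, steps=2000, lr=0.5, seed=0):
    """REINFORCE with a moving-average baseline (hypothetical settings)."""
    rng = random.Random(seed)
    logits = [0.0] * len(sizes)
    baseline = 0.0
    for _ in range(steps):
        order, grad = sample_order(logits, rng)
        reward = utilization(order, sizes, capacity)
        advantage = reward - baseline  # the baseline reduces gradient variance
        logits = [t + lr * advantage * g for t, g in zip(logits, grad)]
        baseline = 0.9 * baseline + 0.1 * reward
    return logits
```

Calling `train([5, 4, 3, 2], 9)` returns one logit per item; items with higher logits are more likely to be sampled early in the packing order. Subtracting the baseline leaves the expected gradient unchanged while lowering its variance, which is the usual reason for pairing it with the policy gradient method.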
Pages: 13