Learning to multi-vehicle cooperative bin packing problem via sequence-to-sequence policy network with deep reinforcement learning model

Cited by: 7
Authors
Tian, Ran [1]
Kang, Chunming [1]
Bi, Jiaming [1]
Ma, Zhongyu [1]
Liu, Yanxing [1]
Yang, Saisai [1]
Li, Fangfang [1]
Affiliation
[1] Northwest Normal Univ, Dept Coll Comp Sci & Engn, Lanzhou 730070, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep Reinforcement Learning; 3D Bin Packing Policy; Position Sequence; Logistics Packing; SEARCH ALGORITHM; LOCAL SEARCH; SUPPLY CHAIN; OPTIMIZATION;
DOI
10.1016/j.cie.2023.108998
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
In logistics bin packing scenarios where vehicles have only rear doors, the packing sequence of items determines how well the vehicle packing space is utilized, yet relatively little research has addressed optimizing that sequence. This article therefore focuses on the packing sequence problem within the multi-vehicle cooperative bin packing problem (MVCBPP) and proposes S2SDRL, a deep reinforcement learning model based on a sequence-to-sequence policy network. First, a sequence-to-sequence neural network model is constructed to determine the packing probability of every item; its encoder and decoder are built by combining a bidirectional LSTM model with an attention module. Second, the bin packing strategy for the items is obtained through the constructed reinforcement learning packing framework. Finally, the Seq2Seq policy network is updated and optimized by the policy gradient method with a baseline to obtain the current optimal packing strategy. In several bin packing scenarios, S2SDRL improves the average vehicle space utilization by more than 4.0% compared with the traditional packing algorithm, and the forward computation time of the model is much shorter than that of the traditional heuristic algorithm, which gives the model greater practical application value. Ablation experiments confirm the effectiveness of the modules in the S2SDRL model for optimizing the packing order. The sensitivity analysis shows that the model retains a degree of stability when the input data change.
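The training idea named in the abstract, a policy gradient update with a baseline applied to a policy that outputs a packing order, can be illustrated with a minimal sketch. This is not the authors' S2SDRL model: the bidirectional LSTM encoder and attention decoder are replaced here by one learnable score per item, the reward is a toy one-dimensional utilization, and every name (`utilization`, `sample_order`, `train`) and hyperparameter is an assumption made for illustration.

```python
import math
import random


def utilization(order, sizes, capacity):
    """Toy reward (assumed): load items in order, stop at the first one that does not fit."""
    used = 0
    for i in order:
        if used + sizes[i] > capacity:
            break
        used += sizes[i]
    return used / capacity


def sample_order(logits, rng):
    """Sample a packing order item by item; return it with d(log-prob)/d(logits)."""
    remaining = list(range(len(logits)))
    order = []
    grad = [0.0] * len(logits)
    while remaining:
        # Softmax over the items that have not been placed yet.
        top = max(logits[i] for i in remaining)
        weights = [math.exp(logits[i] - top) for i in remaining]
        total = sum(weights)
        probs = [w / total for w in weights]
        chosen = rng.choices(remaining, weights=probs, k=1)[0]
        # Gradient of log softmax: one-hot(chosen) minus the probabilities.
        for i, p in zip(remaining, probs):
            grad[i] -= p
        grad[chosen] += 1.0
        order.append(chosen)
        remaining.remove(chosen)
    return order, grad


def train(sizes, capacity, steps=2000, lr=0.5, seed=0):
    """REINFORCE with a moving-average baseline (a stand-in for the paper's baseline)."""
    rng = random.Random(seed)
    logits = [0.0] * len(sizes)
    baseline = 0.0
    for _ in range(steps):
        order, grad = sample_order(logits, rng)
        reward = utilization(order, sizes, capacity)
        advantage = reward - baseline  # the baseline reduces gradient variance
        logits = [l + lr * advantage * g for l, g in zip(logits, grad)]
        baseline = 0.9 * baseline + 0.1 * reward
    return logits


if __name__ == "__main__":
    sizes = [6, 5, 4, 3]
    learned = train(sizes, capacity=10)
    order, _ = sample_order(learned, random.Random(1))
    print(order, utilization(order, sizes, 10))
```

The sketch keeps only the structure the abstract describes: a stochastic policy proposes a sequence, the packing framework scores it, and the score minus a baseline weights the log-probability gradient. A real implementation would condition each choice on item features and on the items already placed.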
Pages: 13