Optimization of 2D Irregular Packing: Deep Reinforcement Learning with Dense Reward

Authors
Crescitelli, Viviana [1]
Oshima, Takashi [1]
Affiliations
[1] Hitachi Ltd., Research & Development Group, Tokyo, Japan
Keywords
Irregular packing; reinforcement learning; factory automation; machine learning; reward; algorithm
DOI
10.1142/S1793351X24430025
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces a method to solve the 2D irregular packing problem using Deep Reinforcement Learning (Deep RL) for logistics. Our method employs a Q agent trained to predict the best placement within a container, maximizing available space. Unlike previous Deep RL algorithms, our method introduces a dense reward function at each packing step, providing immediate feedback and accelerating learning. To our knowledge, this is the first approach to use a dense reward to address the 2D irregular packing problem. Building on our earlier work, we improve the deep neural network by incorporating the Double Deep Q-Network (DDQN) framework to enhance our deep Q-learning approach, reducing overestimation biases and improving decision-making reliability. Simulation results show the method's effectiveness in completing the online 2D irregular packing tasks, achieving promising volume efficiency and packed piece metrics. This research extends our initial findings, highlighting the practical importance of DDQN and dense reward in advancing 2D irregular packing problem-solving. These advancements not only broaden the applications of deep learning but also hold practical importance for real-world logistics challenges.
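The two ideas named in the abstract, a per-step (dense) reward and the Double DQN target, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the reward formula, so `dense_reward` below assumes a hypothetical reward equal to the gain in filled fraction of a binary occupancy grid, and the function names and grid representation are illustrative only. `ddqn_targets` follows the standard Double DQN rule, in which the online network selects the next action and the target network evaluates it.

```python
def ddqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Standard Double DQN targets for a batch of transitions.

    q_online_next / q_target_next: per-transition lists of Q-values for the
    next state, from the online and target networks respectively.
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        # Action selection uses the online network ...
        best = max(range(len(q_on)), key=lambda a: q_on[a])
        # ... while its value is read from the target network.
        bootstrap = 0.0 if done else gamma * q_tg[best]
        targets.append(r + bootstrap)
    return targets


def dense_reward(grid_before, grid_after):
    """Hypothetical dense reward (an assumption, not from the paper):
    increase in the filled fraction of a 0/1 occupancy grid after one placement."""
    cells = sum(len(row) for row in grid_after)
    filled_before = sum(map(sum, grid_before))
    filled_after = sum(map(sum, grid_after))
    return (filled_after - filled_before) / cells


# Usage: two transitions, the second one terminal.
targets = ddqn_targets(
    rewards=[1.0, 1.0],
    dones=[False, True],
    q_online_next=[[1, 2], [3, 0]],
    q_target_next=[[10, 20], [30, 40]],
    gamma=0.5,
)

# Usage: placing a 2x2 piece in an empty 4x4 container grid.
empty = [[0] * 4 for _ in range(4)]
placed = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
step_reward = dense_reward(empty, placed)
```

In the second transition a plain DQN target would bootstrap from the target network's own maximum; decoupling selection from evaluation is what reduces the overestimation bias mentioned in the abstract.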
Pages: 405-416
Number of pages: 12