A two-stage RNN-based deep reinforcement learning approach for solving the parallel machine scheduling problem with due dates and family setups
Authors:
Funing Li
Affiliation: Otto von Guericke University Magdeburg, Institute of Logistics and Material Handling Systems
Sebastian Lang
Affiliation: Otto von Guericke University Magdeburg, Institute of Logistics and Material Handling Systems
Bingyuan Hong
Affiliation: Otto von Guericke University Magdeburg, Institute of Logistics and Material Handling Systems
Tobias Reggelin
Affiliation: Otto von Guericke University Magdeburg, Institute of Logistics and Material Handling Systems
Affiliations:
[1] Otto von Guericke University Magdeburg, Institute of Logistics and Material Handling Systems
[2] Fraunhofer Institute for Factory Operation and Automation IFF, National
[3] Zhejiang Ocean University, Local Joint Engineering Laboratory of Harbor Oil & Gas Storage and Transportation Technology / Zhejiang Provincial Key Laboratory of Petrochemical Pollution Control / School of Petrochemical Engineering and Environment
Keywords: Deep reinforcement learning; Parallel machine scheduling; Family setups; Recurrent neural network
DOI: not available
Abstract:
The parallel machine scheduling problem (PMSP) with family setup constraints is an essential scheduling problem with several practical applications; it is proven to be NP-hard and is difficult to solve. We therefore present a deep reinforcement learning (DRL) approach to a PMSP with family setups that aims to minimize the total tardiness. The PMSP is first modeled as a Markov decision process, for which we design a novel variable-length representation of states and actions, so that the DRL agent can calculate a comprehensive priority for each job at each decision time point and then select the next job directly according to these priorities. The variable-length state matrix and action vector also enable the trained agent to solve instances of any scale. To handle the variable-length sequence while ensuring that each calculated priority is a global priority among all jobs, we employ a recurrent neural network, specifically a gated recurrent unit, to approximate the policy of the agent. The agent is trained with the Proximal Policy Optimization algorithm. Moreover, we develop a two-stage training strategy to enhance the training efficiency. In the numerical experiments, we first train the agent on a given instance and then employ it to solve instances of much larger scale. The experimental results demonstrate the strong generalization capability of the trained agent, and a comparison with three dispatching rules and two metaheuristics further validates the superiority of this agent.
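The decision loop described in the abstract (score every remaining job at each decision point, then dispatch the highest-priority one) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Job` class, the `simulate` function, the single sequence-independent `setup_time`, the rule that the first job on a machine needs no setup, and the earliest-due-date scoring function `edd_priority` (standing in for the trained GRU policy) are all assumptions made for illustration.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Job:
    proc: float   # processing time
    due: float    # due date
    family: int   # setup family (hypothetical integer label)


def edd_priority(job, now, machine_family):
    # Placeholder for a learned policy: earlier due date means higher priority.
    return -job.due


def simulate(jobs, n_machines, setup_time, priority_fn):
    """Build a schedule by repeated priority dispatching; return total tardiness.

    Assumptions (not from the source): one sequence-independent setup time is
    paid whenever a machine switches family, and the first job on a machine
    needs no setup.
    """
    remaining = list(jobs)
    # Min-heap of (time the machine becomes free, machine id).
    machines = [(0.0, m) for m in range(n_machines)]
    heapq.heapify(machines)
    last_family = [None] * n_machines
    total_tardiness = 0.0
    while remaining:
        # Decision point: the earliest-free machine asks for its next job.
        now, m = heapq.heappop(machines)
        # Score every remaining job; the list may have any length.
        job = max(remaining, key=lambda j: priority_fn(j, now, last_family[m]))
        remaining.remove(job)
        needs_setup = last_family[m] is not None and last_family[m] != job.family
        finish = now + (setup_time if needs_setup else 0.0) + job.proc
        total_tardiness += max(0.0, finish - job.due)
        last_family[m] = job.family
        heapq.heappush(machines, (finish, m))
    return total_tardiness


jobs = [Job(3, 4, 0), Job(2, 3, 1), Job(4, 6, 0), Job(1, 9, 1)]
print(simulate(jobs, n_machines=2, setup_time=1.0, priority_fn=edd_priority))
```

Because `priority_fn` receives one job at a time alongside the current time and the machine's last family, the same loop works for any number of jobs, which loosely mirrors the variable-length idea in the abstract; a learned policy would replace `edd_priority` with a model that scores all jobs jointly.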