A model-based reinforcement learning approach for maintenance optimization of degrading systems in a large state space

Cited by: 26
Authors
Zhang, Ping [1 ,2 ]
Zhu, Xiaoyan [1 ]
Xie, Min [2 ,3 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Econ & Management, Bldg 7,80 Zhongguancun East Rd, Beijing, Peoples R China
[2] City Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China
[3] City Univ Hong Kong, Sch Data Sci, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Maintenance optimization; Periodic inspection; Model-based reinforcement learning; Degrading system; PREDICTIVE MAINTENANCE; DEGRADATION; RELIABILITY; POLICY; ANALYTICS; SUBJECT; PARTS;
DOI
10.1016/j.cie.2021.107622
CLC classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Scheduling maintenance tasks based on the deterioration process has often relied on degradation models. In practice, however, the analytical form of the degradation process is usually unknown and difficult to determine for an operating system. In this study, we develop a model-based reinforcement learning approach for maintenance optimization. The approach determines a maintenance action for each degradation state at each inspection time over a finite planning horizon, whether the degradation formula is known or unknown. At each inspection time, it learns an assessment value for each maintenance action that can be performed in each degradation state; the assessment value quantifies how good a state-action pair is with respect to minimizing the accumulated maintenance cost over the planning horizon. When a well-defined degradation formula is available, we customize a Q-learning method with model-based acceleration to optimize the assessment values. When the degradation formula is unknown or hard to determine, we develop a Dyna-Q method with maintenance-oriented improvements: an environment model capturing the degradation pattern under different maintenance actions is learned first, and the assessment values are then optimized while accounting for the stochastic behavior of the system degradation. The final maintenance policy is obtained by performing, in each state, the maintenance action with the highest assessment value. Experimental studies illustrate the applications.
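To make the Dyna-Q idea in the abstract concrete, the sketch below shows a generic Dyna-Q loop applied to a toy maintenance MDP. It is a minimal illustration, not the authors' exact formulation: the discretized degradation states, the three actions (do nothing, imperfect repair, replacement), the cost figures, and the `true_env` dynamics are all invented assumptions, and for brevity it learns a stationary policy rather than one indexed by inspection time as in the paper.

```python
import random
from collections import defaultdict

# Minimal Dyna-Q sketch for a maintenance MDP (illustrative assumptions only).
# Degradation state is discretized to 0..N_STATES-1; state FAIL_STATE is failure.
N_STATES, FAIL_STATE = 20, 19
ACTIONS = (0, 1, 2)                    # 0: do nothing, 1: imperfect repair, 2: replace
COST = {0: 0.0, 1: 5.0, 2: 20.0}       # hypothetical action costs
FAILURE_COST = 100.0                   # hypothetical penalty for reaching failure
ALPHA, GAMMA, EPS, PLANNING_STEPS = 0.1, 0.95, 0.1, 10

def true_env(s, a):
    """Degradation dynamics unknown to the agent; used only to generate experience."""
    if a == 2:                         # replacement: as good as new
        s_next = 0
    elif a == 1:                       # imperfect repair: partial restoration
        s_next = max(0, s - random.randint(1, 3))
    else:                              # no action: stochastic degradation increment
        s_next = min(FAIL_STATE, s + random.randint(0, 2))
    cost = COST[a] + (FAILURE_COST if s_next == FAIL_STATE else 0.0)
    return s_next, -cost               # reward = negative maintenance cost

Q = defaultdict(float)                 # assessment value for each (state, action) pair
model = {}                             # learned environment model: (s, a) -> (s', r)

def choose(s):
    """Epsilon-greedy action selection over the assessment values."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

for episode in range(2000):
    s = 0
    for t in range(50):                # finite horizon of inspection epochs
        a = choose(s)
        s_next, r = true_env(s, a)     # real experience observed at an inspection
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s_next, b)] for b in ACTIONS) - Q[(s, a)])
        model[(s, a)] = (s_next, r)    # update the learned degradation model
        for _ in range(PLANNING_STEPS):  # planning: replay simulated transitions
            ps, pa = random.choice(list(model))
            ps_next, pr = model[(ps, pa)]
            Q[(ps, pa)] += ALPHA * (pr + GAMMA * max(Q[(ps_next, b)] for b in ACTIONS) - Q[(ps, pa)])
        s = s_next

# Final policy: in each state, take the action with the highest assessment value.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)}
```

The planning loop is what distinguishes Dyna-Q from plain Q-learning: each real transition also updates a learned model, and several simulated transitions drawn from that model refine the assessment values between inspections.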
Pages: 14