Novel data-driven two-dimensional Q-learning for optimal tracking control of batch process with unknown dynamics

Cited by: 18
Authors
Wen, Xin [1 ]
Shi, Huiyuan [1 ,2 ,3 ]
Su, Chengli [1 ,4 ,7 ]
Jiang, Xueying [5 ]
Li, Ping [1 ,4 ]
Yu, Jingxian [6 ]
Affiliations
[1] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun, Peoples R China
[2] Northwestern Polytech Univ, Sch Automat, Xian, Peoples R China
[3] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang, Peoples R China
[4] Univ Sci & Technol Liaoning, Sch Elect & Informat Engn, Anshan, Peoples R China
[5] Northeastern Univ, Sch Informat Sci & Engn, Shenyang, Peoples R China
[6] Liaoning Petrochem Univ, Sch Sci, Fushun, Peoples R China
[7] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun 113001, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Batch process; Data-driven; 2D off-policy Q-learning; Optimal tracking control; Injection molding; MODEL PREDICTIVE CONTROL; FAULT-TOLERANT CONTROL; STATE DELAY; DESIGN; FEEDBACK;
DOI
10.1016/j.isatra.2021.06.007
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Because previous control methods usually rely too heavily on models of the batch process and are difficult to apply to a practical batch process with unknown dynamics, a novel data-driven two-dimensional (2D) off-policy Q-learning approach for optimal tracking control (OTC) is proposed so that the batch process obtains a model-free control law. Firstly, an extended state space equation composed of the state and the output error is established to ensure the tracking performance of the designed controller. Secondly, the behavior policy for generating data and the target policy for optimization and learning are introduced based on this extended system. Then, a Bellman equation independent of the model parameters is given by analyzing the relation between the 2D value function and the 2D Q-function. Only the measured data along the batch and time directions of the batch process are used to carry out the policy iteration, which solves the optimal control problem despite the lack of system dynamic information. The unbiasedness and convergence of the designed 2D off-policy Q-learning algorithm are proved. Finally, a simulation case for an injection molding process shows that the control and tracking performance gradually improve as the number of batches increases. (c) 2021 ISA. Published by Elsevier Ltd. All rights reserved.
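The abstract's core idea (evaluate a target policy from data produced by a different behavior policy, then improve it, without using the plant model) can be illustrated on a much simpler problem. The sketch below is not the paper's 2D tracking algorithm: it assumes an ordinary one-dimensional-in-time linear-quadratic regulator with no batch direction and no output-error augmentation, and the plant matrices `A` and `B`, the weights `Qc` and `R`, and all function names are hypothetical choices made only for this example. The learner sees just the logged tuples `(x, u, x_next)`; `A` and `B` are used only to simulate data and to compute a Riccati reference for comparison.

```python
import numpy as np

def collect_data(A, B, n_samples, rng):
    """Simulate the plant under a purely random behavior policy and log (x, u, x_next)."""
    n, m = B.shape
    x = np.ones(n)
    data = []
    for _ in range(n_samples):
        u = rng.standard_normal(m)      # exploratory behavior input
        x_next = A @ x + B @ u          # plant response (unknown to the learner)
        data.append((x, u, x_next))
        x = x_next
    return data

def q_learning_policy_iteration(data, Qc, R, K0, n_iter=15):
    """Off-policy Q-learning policy iteration for the target policy u = -K x.

    Q(x, u) = z^T H z with z = [x; u]. Each iteration solves the Bellman
    equation z^T H z - z_next^T H z_next = x^T Qc x + u^T R u by least
    squares, where z_next uses the target action -K x_next.
    """
    n, m = Qc.shape[0], R.shape[0]
    K = K0.copy()
    for _ in range(n_iter):
        Phi, y = [], []
        for x, u, x_next in data:
            z = np.concatenate([x, u])
            z_next = np.concatenate([x_next, -K @ x_next])
            Phi.append(np.kron(z, z) - np.kron(z_next, z_next))
            y.append(x @ Qc @ x + u @ R @ u)
        h, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = h.reshape(n + m, n + m)
        H = 0.5 * (H + H.T)                          # enforce symmetry
        K = np.linalg.solve(H[n:, n:], H[n:, :n])    # policy improvement
    return K, H

def riccati_gain(A, B, Qc, R, n_iter=2000):
    """Model-based reference gain from fixed-point iteration of the Riccati equation."""
    P = Qc.copy()
    for _ in range(n_iter):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Qc + A.T @ P @ (A - B @ K)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Hypothetical stable plant, so the zero gain is an admissible initial policy.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
Qc, R = np.eye(2), np.array([[1.0]])

rng = np.random.default_rng(0)
data = collect_data(A, B, n_samples=200, rng=rng)
K_learned, H = q_learning_policy_iteration(data, Qc, R, K0=np.zeros((1, 2)))
K_ref = riccati_gain(A, B, Qc, R)
print("learned gain:", K_learned)
print("Riccati gain:", K_ref)
```

Because the Bellman equation for the Q-function holds for any applied input, the same batch of exploratory data is reused at every iteration. In this noise-free setting the learned gain should match the Riccati reference to within numerical precision.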
Pages: 10 - 21
Page count: 12
Related papers
50 in total
  • [1] Novel two-dimensional off-policy Q-learning method for output feedback optimal tracking control of batch process with unknown dynamics
    Shi, Huiyuan
    Yang, Chen
    Jiang, Xueying
    Su, Chengli
    Li, Ping
    JOURNAL OF PROCESS CONTROL, 2022, 113 : 29 - 41
  • [2] Optimal tracking control of nonlinear batch processes with unknown dynamics using two-dimensional off-policy interleaved Q-learning algorithm
    Shi, Huiyuan
    Gao, Wei
    Jiang, Xueying
    Su, Chengli
    Li, Ping
    INTERNATIONAL JOURNAL OF CONTROL, 2024, 97 (10) : 2329 - 2341
  • [3] Data-Driven Tracking Control for Multi-Agent Systems With Unknown Dynamics via Multithreading Iterative Q-Learning
    Dong, Tao
    Gong, Xiaomei
    Wang, Aijuan
    Li, Huaqing
    Huang, Tingwen
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (04) : 2533 - 2542
  • [4] Optimal tracking control of batch processes with time-invariant state delay: Adaptive Q-learning with two-dimensional state and control policy
    Shi, Huiyuan
    Lv, Mengdi
    Jiang, Xueying
    Su, Chengli
    Li, Ping
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 132
  • [5] Data-Driven Two-Dimensional Deep Correlated Representation Learning for Nonlinear Batch Process Monitoring
    Jiang, Qingchao
    Yan, Shifu
    Yan, Xuefeng
    Yi, Hui
    Gao, Furong
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2020, 16 (04) : 2839 - 2848
  • [6] Data-driven two-dimensional integrated control for nonlinear batch processes
    Zhou, Chengyu
    Jia, Li
    Li, Jianfang
    Chen, Yan
    JOURNAL OF PROCESS CONTROL, 2024, 135
  • [7] Safe Q-Learning for Data-Driven Nonlinear Optimal Control with Asymmetric State Constraints
    Zhao, Mingming
    Wang, Ding
    Song, Shijie
    Qiao, Junfei
    IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2024, 11 (12) : 2408 - 2422
  • [8] Safe Q-Learning for Data-Driven Nonlinear Optimal Control With Asymmetric State Constraints
    Mingming Zhao
    Ding Wang
    Shijie Song
    Junfei Qiao
    IEEE/CAA Journal of Automatica Sinica, 2024, 11 (12) : 2408 - 2422
  • [9] A Combined Policy Gradient and Q-learning Method for Data-driven Optimal Control Problems
    Lin, Mingduo
    Liu, Derong
    Zhao, Bo
    Dai, Qionghai
    Dong, Yi
    2019 9TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST2019), 2019 : 6 - 10
  • [10] Data-driven tracking control approach for linear systems by on-policy Q-learning approach
    Zhang Yihan
    Mao Zhenfei
    Li Jinna
    16TH IEEE INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION (ICARCV 2020), 2020 : 1066 - 1070