Explainable data-driven Q-learning control for a class of discrete-time linear autonomous systems

Cited by: 0
Authors
Perrusquia, Adolfo [1 ]
Zou, Mengbang [1 ]
Guo, Weisi [1 ]
Affiliations
[1] Cranfield Univ, Sch Aerosp Transport & Mfg, Bedford MK43 0AL, England
Keywords
Q-learning; State-transition function; Explainable Q-learning (XQL); Control policy; Reinforcement; Identification
DOI
10.1016/j.ins.2024.121283
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Explaining what a reinforcement learning (RL) control agent learns plays a crucial role in the safety-critical control domain. Most state-of-the-art approaches focus on imitation learning methods that uncover the hidden reward function of a given control policy. However, these approaches do not reveal what the RL agent actually learns from the agent-environment interaction. The policy learned by the RL agent depends on how well the state-transition mapping is inferred from the data; a wrongly inferred state-transition mapping implies that the RL agent is not learning properly, which can compromise the safety of both the surrounding environment and the agent itself. In this paper, we aim to uncover the elements learned by data-driven RL control agents in a special class of discrete-time linear autonomous systems. The approach adds a new explainability dimension to data-driven control methods to increase their trustworthiness and safe deployment. We focus on the classical data-driven Q-learning algorithm and propose an explainable Q-learning (XQL) algorithm that can be further extended to other data-driven RL control agents. Simulation experiments demonstrate the effectiveness of the proposed approach under different scenarios using several discrete-time models of autonomous platforms.
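The paper's XQL algorithm itself is not reproduced in this record. As a point of reference, the sketch below shows the classical data-driven Q-learning baseline for a discrete-time linear system that the abstract builds on: a least-squares policy-iteration scheme that fits the quadratic Q-kernel from transition data and then improves the feedback gain. All matrices, gains, and tuning values are hypothetical illustrations, not the paper's models.

```python
import numpy as np

# Hypothetical stable 2-state model standing in for one of the paper's
# autonomous-platform examples (illustrative values only).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.1]])
Qc, Rc = np.eye(2), np.eye(1)   # quadratic stage-cost weights
n, m = 2, 1
rng = np.random.default_rng(0)

def q_learning_lqr(K, iters=10, samples=60):
    """Least-squares Q-learning (policy iteration) for x+ = A x + B u.
    Fits the quadratic Q-kernel H from transition data, then improves
    the gain K of the policy u = -K x."""
    for _ in range(iters):
        Phi, cost = [], []
        for t in range(samples):
            if t % 15 == 0:                        # re-excite the state
                x = rng.normal(size=n)
            u = -K @ x + 0.5 * rng.normal(size=m)  # exploration noise
            x_next = A @ x + B @ u
            u_next = -K @ x_next                   # greedy successor action
            z, zn = np.concatenate([x, u]), np.concatenate([x_next, u_next])
            Phi.append(np.kron(z, z) - np.kron(zn, zn))  # TD regressor
            cost.append(x @ Qc @ x + u @ Rc @ u)         # observed stage cost
            x = x_next
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(cost), rcond=None)
        H = theta.reshape(n + m, n + m)
        H = 0.5 * (H + H.T)                        # symmetric Q-kernel
        K = np.linalg.solve(H[n:, n:], H[n:, :n])  # policy improvement
    return K, H

# Model-based LQR gain (Riccati iteration) as a check on the estimate.
P = Qc.copy()
for _ in range(500):
    G = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
    P = Qc + A.T @ P @ (A - B @ G)
K_star = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)

K_hat, H = q_learning_lqr(np.zeros((m, n)))
print("data-driven gain:", K_hat)
print("model-based gain:", K_star)
```

In this quadratic setting the learned kernel has the block structure H = [[Qc + AᵀPA, AᵀPB], [BᵀPA, Rc + BᵀPB]], so it implicitly encodes the state-transition matrices A and B. Checking whether that encoded transition mapping is consistent with the observed data is the kind of explainability question the abstract raises about data-driven Q-learning agents.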
Pages: 15
Related Papers
50 records in total
  • [31] H∞ Tracking Control for Linear Discrete-Time Systems: Model-Free Q-Learning Designs
    Yang, Yunjie
    Wan, Yan
    Zhu, Jihong
    Lewis, Frank L.
    IEEE CONTROL SYSTEMS LETTERS, 2021, 5 (01) : 175 - 180
  • [32] H∞ Tracking Control of Unknown Discrete-Time Linear Systems via Output-Data-Driven Off-policy Q-learning Algorithm
    Zhang, Kun
    Liu, Xuantong
    Zhang, Lei
    Chen, Qian
    Peng, Yunjian
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022 : 2350 - 2356
  • [33] Data-driven output consensus for a class of discrete-time multiagent systems by reinforcement learning techniques
    Liu, Yuanshan
    Xia, Yude
    Huang, Jingxin
    SIGNAL PROCESSING, 2024, 223
  • [34] An Indirect Data-Driven Method for Trajectory Tracking Control of a Class of Nonlinear Discrete-Time Systems
    Wang, Zhuo
    Lu, Renquan
    Gao, Furong
    Liu, Derong
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2017, 64 (05) : 4121 - 4129
  • [35] Output Feedback Q-Learning Control for the Discrete-Time Linear Quadratic Regulator Problem
    Rizvi, Syed Ali Asad
    Lin, Zongli
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (05) : 1523 - 1536
  • [36] Model-Free Q-Learning for the Tracking Problem of Linear Discrete-Time Systems
    Li, Chun
    Ding, Jinliang
    Lewis, Frank L.
    Chai, Tianyou
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (03) : 3191 - 3201
  • [37] A Novel Data-Driven Terminal Iterative Learning Control with Iteration Prediction Algorithm for a Class of Discrete-Time Nonlinear Systems
    Jin, Shangtai
    Hou, Zhongsheng
    Chi, Ronghu
    JOURNAL OF APPLIED MATHEMATICS, 2014
  • [38] An Optimal Tracking Control Method with Q-learning for Discrete-time Linear Switched System
    Zhao, Shangwei
    Wang, Jingcheng
    Wang, Hongyuan
    Xu, Haotian
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020 : 1414 - 1419
  • [39] Adaptive iterative learning control based on IF-THEN rules and data-driven scheme for a class of nonlinear discrete-time systems
    Treesatayapun, Chidentree
    SOFT COMPUTING, 2018, 22 (02) : 487 - 497
  • [40] Data-driven adaptive optimal control for discrete-time periodic systems
    Wu, Ai-Guo
    Meng, Yuan
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024