Explainable data-driven Q-learning control for a class of discrete-time linear autonomous systems

Cited by: 0
Authors
Perrusquia, Adolfo [1 ]
Zou, Mengbang [1 ]
Guo, Weisi [1 ]
Affiliations
[1] Cranfield Univ, Sch Aerosp Transport & Mfg, Bedford MK43 0AL, England
Keywords
Q-learning; State-transition function; Explainable Q-learning (XQL); Control policy; Reinforcement; Identification
DOI
10.1016/j.ins.2024.121283
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Explaining what a reinforcement learning (RL) control agent learns plays a crucial role in the safety-critical control domain. Most state-of-the-art approaches focus on imitation learning methods that uncover the hidden reward function of a given control policy. However, these approaches do not reveal what the RL agent actually learns from the agent-environment interaction. The policy learned by the RL agent depends on how well the state-transition mapping is inferred from the data; a wrongly inferred state-transition mapping implies that the RL agent is not learning properly, which can compromise the safety of both the surrounding environment and the agent itself. In this paper, we aim to uncover the elements learned by data-driven RL control agents in a special class of discrete-time linear autonomous systems. The approach adds a new explainability dimension to data-driven control methods to increase their trustworthiness and safe deployment. We focus on the classical data-driven Q-learning algorithm and propose an explainable Q-learning (XQL) algorithm that can be further extended to other data-driven RL control agents. Simulation experiments demonstrate the effectiveness of the proposed approach under different scenarios using several discrete-time models of autonomous platforms.
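The paper's XQL algorithm itself is not reproduced in this record. As a point of reference, the sketch below shows the classical data-driven Q-learning baseline for a discrete-time linear system that the abstract builds on: a least-squares policy-iteration scheme that fits the quadratic Q-kernel from transition data and then improves the feedback gain. All matrices, gains, and tuning values are hypothetical illustrations, not the paper's models.

```python
import numpy as np

# Hypothetical stable 2-state model standing in for one of the paper's
# autonomous-platform examples (illustrative values only).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.1]])
Qc, Rc = np.eye(2), np.eye(1)   # quadratic stage-cost weights
n, m = 2, 1
rng = np.random.default_rng(0)

def q_learning_lqr(K, iters=10, samples=60):
    """Least-squares Q-learning (policy iteration) for x+ = A x + B u.
    Fits the quadratic Q-kernel H from transition data, then improves
    the gain K of the policy u = -K x."""
    for _ in range(iters):
        Phi, cost = [], []
        for t in range(samples):
            if t % 15 == 0:                        # re-excite the state
                x = rng.normal(size=n)
            u = -K @ x + 0.5 * rng.normal(size=m)  # exploration noise
            x_next = A @ x + B @ u
            u_next = -K @ x_next                   # greedy successor action
            z, zn = np.concatenate([x, u]), np.concatenate([x_next, u_next])
            Phi.append(np.kron(z, z) - np.kron(zn, zn))  # TD regressor
            cost.append(x @ Qc @ x + u @ Rc @ u)         # observed stage cost
            x = x_next
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(cost), rcond=None)
        H = theta.reshape(n + m, n + m)
        H = 0.5 * (H + H.T)                        # symmetric Q-kernel
        K = np.linalg.solve(H[n:, n:], H[n:, :n])  # policy improvement
    return K, H

# Model-based LQR gain (Riccati iteration) as a check on the estimate.
P = Qc.copy()
for _ in range(500):
    G = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
    P = Qc + A.T @ P @ (A - B @ G)
K_star = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)

K_hat, H = q_learning_lqr(np.zeros((m, n)))
print("data-driven gain:", K_hat)
print("model-based gain:", K_star)
```

In this quadratic setting the learned kernel has the block structure H = [[Qc + AᵀPA, AᵀPB], [BᵀPA, Rc + BᵀPB]], so it implicitly encodes the state-transition matrices A and B. Checking whether that encoded transition mapping is consistent with the observed data is the kind of explainability question the abstract raises about data-driven Q-learning agents.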
Pages: 15
Related Papers
50 records in total
  • [31] H∞ Tracking Control for Linear Discrete-Time Systems: Model-Free Q-Learning Designs
    Yang, Yunjie
    Wan, Yan
    Zhu, Jihong
    Lewis, Frank L.
    IEEE CONTROL SYSTEMS LETTERS, 2021, 5 (01) : 175 - 180
  • [32] H∞ Tracking Control of Unknown Discrete-Time Linear Systems via Output-Data-Driven Off-policy Q-learning Algorithm
    Zhang, Kun
    Liu, Xuantong
    Zhang, Lei
    Chen, Qian
    Peng, Yunjian
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022 : 2350 - 2356
  • [33] Data-driven output consensus for a class of discrete-time multiagent systems by reinforcement learning techniques
    Liu, Yuanshan
    Xia, Yude
    Huang, Jingxin
    SIGNAL PROCESSING, 2024, 223
  • [34] An Indirect Data-Driven Method for Trajectory Tracking Control of a Class of Nonlinear Discrete-Time Systems
    Wang, Zhuo
    Lu, Renquan
    Gao, Furong
    Liu, Derong
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2017, 64 (05) : 4121 - 4129
  • [35] Output Feedback Q-Learning Control for the Discrete-Time Linear Quadratic Regulator Problem
    Rizvi, Syed Ali Asad
    Lin, Zongli
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (05) : 1523 - 1536
  • [36] Model-Free Q-Learning for the Tracking Problem of Linear Discrete-Time Systems
    Li, Chun
    Ding, Jinliang
    Lewis, Frank L.
    Chai, Tianyou
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (03) : 3191 - 3201
  • [37] A Novel Data-Driven Terminal Iterative Learning Control with Iteration Prediction Algorithm for a Class of Discrete-Time Nonlinear Systems
    Jin, Shangtai
    Hou, Zhongsheng
    Chi, Ronghu
    JOURNAL OF APPLIED MATHEMATICS, 2014
  • [38] An Optimal Tracking Control Method with Q-learning for Discrete-time Linear Switched System
    Zhao, Shangwei
    Wang, Jingcheng
    Wang, Hongyuan
    Xu, Haotian
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020 : 1414 - 1419
  • [39] Adaptive iterative learning control based on IF-THEN rules and data-driven scheme for a class of nonlinear discrete-time systems
    Treesatayapun, Chidentree
    SOFT COMPUTING, 2018, 22 (02) : 487 - 497
  • [40] Data-driven adaptive optimal control for discrete-time periodic systems
    Wu, Ai-Guo
    Meng, Yuan
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024