Dynamic Regret Minimization for Control of Non-stationary Linear Dynamical Systems

Cited by: 4
Authors
Luo, Yuwei [1]
Gupta, Varun [2]
Kolar, Mladen [2]
Affiliations
[1] Stanford Univ, 655 Knight Way, Stanford, CA 94305 USA
[2] Univ Chicago, 5807 S Woodlawn Ave, Chicago, IL 60637 USA
Keywords
Linear Quadratic Regulator; dynamic regret; non-stationary learning; ordinary least squares estimator; TIME;
DOI
10.1145/3508029
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We consider the problem of controlling a Linear Quadratic Regulator (LQR) system over a finite horizon $T$ with fixed and known cost matrices $Q$, $R$, but unknown and non-stationary dynamics $\{A_t, B_t\}$. The sequence of dynamics matrices can be arbitrary, but with a total variation, $V_T$, assumed to be $o(T)$ and unknown to the controller. Under the assumption that a sequence of stabilizing, but potentially sub-optimal controllers is available for all $t$, we present an algorithm that achieves the optimal dynamic regret of $\tilde{O}(V_T^{2/5} T^{3/5})$. With piecewise constant dynamics, our algorithm achieves the optimal regret of $\tilde{O}(\sqrt{ST})$ where $S$ is the number of switches. The crux of our algorithm is an adaptive non-stationarity detection strategy, which builds on an approach recently developed for contextual Multi-armed Bandit problems. We also argue that non-adaptive forgetting (e.g., restarting or using sliding window learning with a static window size) may not be regret optimal for the LQR problem, even when the window size is optimally tuned with the knowledge of $V_T$. The main technical challenge in the analysis of our algorithm is to prove that the ordinary least squares (OLS) estimator has a small bias when the parameter to be estimated is non-stationary. Our analysis also highlights that the key motif driving the regret is that the LQR problem is in spirit a bandit problem with linear feedback and locally quadratic cost. This motif is more universal than the LQR problem itself, and therefore we believe our results should find wider application.
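As a rough illustration of the OLS estimation issue the abstract mentions, the sketch below fits the dynamics matrices by least squares on a simulated trajectory with one switch. It is not the authors' algorithm: the function names (`simulate`, `ols_dynamics`), the matrices, the noise level, the random exploratory inputs, and the switch point are all assumptions chosen for the example.

```python
import numpy as np

def simulate(A_seq, B, rng, noise_std=0.1):
    """Roll out x_{t+1} = A_t x_t + B u_t + w_t with random exploratory inputs."""
    n, m = B.shape
    T = len(A_seq)
    xs = np.zeros((T + 1, n))
    us = rng.standard_normal((T, m))
    for t, A_t in enumerate(A_seq):
        w = noise_std * rng.standard_normal(n)
        xs[t + 1] = A_t @ xs[t] + B @ us[t] + w
    return xs, us

def ols_dynamics(xs, us):
    """Least-squares fit of [A B] from transitions (x_t, u_t) -> x_{t+1}."""
    Z = np.hstack([xs[:-1], us])                   # regressors z_t = (x_t, u_t)
    Y = xs[1:]                                     # targets x_{t+1}
    theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)  # solves Z @ theta ~ Y
    n = xs.shape[1]
    return theta[:n].T, theta[n:].T                # A_hat, B_hat

# Hypothetical example: the dynamics switch once, halfway through.
rng = np.random.default_rng(0)
A1 = np.array([[0.9, 0.2], [0.0, 0.8]])
A2 = np.array([[0.4, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
xs, us = simulate([A1] * 1000 + [A2] * 1000, B, rng)

A_full, _ = ols_dynamics(xs, us)                   # uses pre- and post-switch data
A_recent, _ = ols_dynamics(xs[1000:], us[1000:])   # uses post-switch data only
err_full = np.linalg.norm(A_full - A2)
err_recent = np.linalg.norm(A_recent - A2)
print(err_full, err_recent)
```

Fitting on all the data mixes the two regimes, so the estimate of the current dynamics is worse than one fitted on post-switch data alone. Here the switch point is known in advance, whereas the abstract's setting requires detecting non-stationarity adaptively.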
Pages: 72