Learning Optimal Behavior in Environments with Non-stationary Observations

Authors
Boone, Ilio [1]
Rens, Gavin [1]
Affiliations
[1] Katholieke Univ Leuven, DTAI Grp, Leuven, Belgium
Keywords
Markov Decision Process; Non-Markovian Reward Models; Mealy Reward Model (MRM); Learning MRMs; Non-stationary
DOI
10.5220/0010898200003116
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In sequential decision-theoretic systems, the dynamics might be Markovian (behavior in the next step is independent of the past, given the present), or non-Markovian (behavior in the next step depends on the past). One approach to represent non-Markovian behavior has been to employ deterministic finite automata (DFA) with inputs and outputs (e.g. Mealy machines). Moreover, some researchers have proposed frameworks for learning DFA-based models. There are at least two reasons for a system to be non-Markovian: (i) rewards are gained from temporally dependent tasks, (ii) observations are non-stationary. Rens et al. (2021) tackle learning the applicable DFA for the first case with their ARM algorithm. ARM cannot deal with the second case. Toro Icarte et al. (2019) tackle the problem for the second case with their LRM algorithm. In this paper, we extend ARM to deal with the second case too. The advantage of ARM for learning and acting in non-Markovian systems is that it is based on well-understood formal methods with many available tools.
Pages: 729-736
Number of pages: 8
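
The abstract describes representing non-Markovian rewards with a Mealy machine: an automaton whose output (here, the reward) depends on both its current node and the latest input. The following is a minimal sketch of that idea only, not the ARM or LRM algorithm from the cited works. The class name `MealyRewardMachine`, the helper `run`, and the toy "key then goal" task are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MealyRewardMachine:
    """Hypothetical Mealy reward machine: (node, observation) -> (next node, reward)."""

    start: str
    # transitions[(node, observation)] = (next_node, reward)
    transitions: dict
    node: str = field(init=False)

    def __post_init__(self):
        self.node = self.start

    def reset(self):
        """Return the machine to its start node."""
        self.node = self.start

    def step(self, observation):
        """Consume one observation, move to the next node, and emit the reward."""
        next_node, reward = self.transitions[(self.node, observation)]
        self.node = next_node
        return reward


# Toy task (an assumption): reaching "goal" pays 1.0 only if "key" was seen first.
mrm = MealyRewardMachine(
    start="no_key",
    transitions={
        ("no_key", "empty"): ("no_key", 0.0),
        ("no_key", "key"): ("has_key", 0.0),
        ("no_key", "goal"): ("no_key", 0.0),
        ("has_key", "empty"): ("has_key", 0.0),
        ("has_key", "key"): ("has_key", 0.0),
        ("has_key", "goal"): ("no_key", 1.0),
    },
)


def run(machine, observations):
    """Reset the machine and return the reward emitted at each step."""
    machine.reset()
    return [machine.step(obs) for obs in observations]
```

In this sketch the same observation, `"goal"`, yields different rewards depending on the history, which is the sense in which the reward is non-Markovian with respect to observations alone. Tracking the machine's node alongside the environment state restores the Markov property for this toy task.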