Accelerating Model-Free Reinforcement Learning With Imperfect Model Knowledge in Dynamic Spectrum Access

Times Cited: 14
Authors
Li, Lianjun [1 ]
Liu, Lingjia [1 ]
Bai, Jianan [1 ]
Chang, Hao-Hsuan [1 ]
Chen, Hao [2 ]
Ashdown, Jonathan D. [3 ]
Zhang, Jianzhong [2 ]
Yi, Yang [1 ]
Affiliations
[1] Virginia Tech, Elect & Comp Engn Dept, Blacksburg, VA 24061 USA
[2] Samsung Res Amer, Stand & Mobil Innovat Lab, Plano, TX 75023 USA
[3] Air Force Res Lab, Informat Directorate, Rome, NY 13441 USA
Source
IEEE INTERNET OF THINGS JOURNAL | 2020, Vol. 7, Issue 8
Funding
U.S. National Science Foundation;
Keywords
Computational modeling; Learning (artificial intelligence); Sensors; Wireless communication; Acceleration; Complexity theory; Internet of Things; Dynamic spectrum access (DSA); imperfect model; reinforcement learning (RL); training acceleration; wireless communications systems; NETWORKS;
DOI
10.1109/JIOT.2020.2988268
Chinese Library Classification (CLC)
TP [Automation technology; computer technology];
Discipline Code
0812;
Abstract
Current studies that apply reinforcement learning (RL) to dynamic spectrum access (DSA) problems in wireless communications systems mainly focus on model-free RL (MFRL). However, in practice, MFRL requires a large number of samples to achieve good performance, making it impractical in real-time applications such as DSA. Combining model-free and model-based RL can potentially reduce the sample complexity while achieving a similar level of performance to MFRL, as long as the learned model is accurate enough. However, in a complex environment, the learned model is never perfect. In this article, we combine model-free and model-based RL, and introduce an algorithm that can work with an imperfectly learned model to accelerate MFRL. Results show that our algorithm achieves higher sample efficiency than the standard MFRL algorithm and the Dyna algorithm (a standard algorithm integrating model-based RL and MFRL), with much lower computational complexity than the Dyna algorithm. In the extreme case where the learned model is highly inaccurate, the Dyna algorithm performs even worse than the MFRL algorithm, while our algorithm can still outperform the MFRL algorithm.
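For orientation, the sketch below shows the Dyna-style integration of model-free Q-learning with model-based planning that the abstract uses as a baseline; it is not the authors' accelerated algorithm. The toy environment (ToyChannelEnv), the function name dyna_q, and all hyperparameter values are hypothetical placeholders, assumed only for illustration of how simulated updates from a learned (and possibly imperfect) transition model supplement real-experience updates.

import random
from collections import defaultdict

# Minimal tabular Dyna-Q sketch, for illustration only.
# The environment and the channel-selection framing are hypothetical,
# not the paper's DSA setup.

class ToyChannelEnv:
    """Toy task: pick one of n_channels per step; one channel is 'good'."""
    def __init__(self, n_channels=4, good=2):
        self.n_channels = n_channels
        self.good = good
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        reward = 1.0 if action == self.good else 0.0
        self.state = (self.state + 1) % 2          # trivial two-state dynamics
        return self.state, reward

def dyna_q(env, episodes=200, steps=20, planning_steps=5,
           alpha=0.1, gamma=0.95, eps=0.1):
    q = defaultdict(float)                         # Q[(state, action)]
    model = {}                                     # learned model: (s, a) -> (s', r)
    for _ in range(episodes):
        s = env.reset()
        for _ in range(steps):
            # epsilon-greedy action selection (model-free part)
            if random.random() < eps:
                a = random.randrange(env.n_channels)
            else:
                a = max(range(env.n_channels), key=lambda x: q[(s, x)])
            s2, r = env.step(a)
            # direct RL update from the real transition
            best_next = max(q[(s2, x)] for x in range(env.n_channels))
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            # record the observed transition in the (possibly imperfect) model
            model[(s, a)] = (s2, r)
            # planning: extra Q updates from simulated experience drawn from the model
            for _ in range(planning_steps):
                (ps, pa), (ps2, pr) = random.choice(list(model.items()))
                best_sim = max(q[(ps2, x)] for x in range(env.n_channels))
                q[(ps, pa)] += alpha * (pr + gamma * best_sim - q[(ps, pa)])
            s = s2
    return q

if __name__ == "__main__":
    q = dyna_q(ToyChannelEnv())
    print({a: round(q[(0, a)], 3) for a in range(4)})  # the 'good' channel should score highest

The sketch makes the abstract's trade-off concrete: each real sample triggers planning_steps additional model-based updates, which raises sample efficiency but adds computation, and if the stored model is inaccurate those extra updates can push the Q-values in the wrong direction.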
Pages: 7517-7528
Number of Pages: 12
Related Papers
50 records in total
  • [31] Comparing Model-free and Model-based Algorithms for Offline Reinforcement Learning
    Swazinna, Phillip
    Udluft, Steffen
    Hein, Daniel
    Runkler, Thomas
    IFAC PAPERSONLINE, 2022, 55 (15): : 19 - 26
  • [32] Hybrid control for combining model-based and model-free reinforcement learning
    Pinosky, Allison
    Abraham, Ian
    Broad, Alexander
    Argall, Brenna
    Murphey, Todd D.
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2023, 42 (06): : 337 - 355
  • [33] Linear Quadratic Control Using Model-Free Reinforcement Learning
    Yaghmaie, Farnaz Adib
    Gustafsson, Fredrik
    Ljung, Lennart
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (02) : 737 - 752
  • [34] On Distributed Model-Free Reinforcement Learning Control With Stability Guarantee
    Mukherjee, Sayak
    Vu, Thanh Long
    IEEE CONTROL SYSTEMS LETTERS, 2021, 5 (05): : 1615 - 1620
  • [35] Model-Free Reinforcement Learning of Impedance Control in Stochastic Environments
    Stulp, Freek
    Buchli, Jonas
    Ellmer, Alice
    Mistry, Michael
    Theodorou, Evangelos A.
    Schaal, Stefan
    IEEE TRANSACTIONS ON AUTONOMOUS MENTAL DEVELOPMENT, 2012, 4 (04) : 330 - 341
  • [36] Model-Free Recurrent Reinforcement Learning for AUV Horizontal Control
    Huo, Yujia
    Li, Yiping
    Feng, Xisheng
    3RD INTERNATIONAL CONFERENCE ON AUTOMATION, CONTROL AND ROBOTICS ENGINEERING (CACRE 2018), 2018, 428
  • [37] Limit Reachability for Model-Free Reinforcement Learning of ω-Regular Objectives
    Hahn, Ernst Moritz
    Perez, Mateo
    Schewe, Sven
    Somenzi, Fabio
    Trivedi, Ashutosh
    Wojtczak, Dominik
    PROCEEDINGS OF THE 5TH INTERNATIONAL WORKSHOP ON SYMBOLIC-NUMERIC METHODS FOR REASONING ABOUT CPS AND IOT (SNR 2019), 2019, : 16 - 18
  • [38] Model-Free Control for Soft Manipulators based on Reinforcement Learning
    You, Xuanke
    Zhang, Yixiao
    Chen, Xiaotong
    Liu, Xinghua
    Wang, Zhanchi
    Jiang, Hao
    Chen, Xiaoping
    2017 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2017, : 2909 - 2915
  • [39] Model-Free Reinforcement Learning with the Decision-Estimation Coefficient
    Foster, Dylan J.
    Golowich, Noah
    Qian, Jian
    Rakhlin, Alexander
    Sekhari, Ayush
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [40] On the importance of hyperparameters tuning for model-free reinforcement learning algorithms
    Tejer, Mateusz
    Szczepanski, Rafal
    2024 12TH INTERNATIONAL CONFERENCE ON CONTROL, MECHATRONICS AND AUTOMATION, ICCMA, 2024, : 78 - 82