NPV-DQN: Improving Value-based Reinforcement Learning, by Variable Discount Factor, with Control Applications

Cited by: 0
Authors
Paczolay, Gabor [1 ]
Harmati, Istvan [1 ]
Affiliations
[1] Budapest Univ Technol & Econ, Dept Control Engn, Magyar tudosok krt 2,1 bldg, H-1117 Budapest, Hungary
Keywords
reinforcement learning; DQN; NPV; NPV-DQN
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
The discount factor plays an important role in reinforcement learning algorithms: it determines how much future rewards are valued at the present time-step. In this paper, a system is used in which Q-values are estimated with two distinct discount factors. These estimates can later be merged into one network to make the computations more efficient. The choice of which network to use is based on the relative value of the maximum Q-value of the short-term network: the more unambiguous the maximum is, the higher the probability of selecting that network. The system is then benchmarked on a cartpole and a gridworld environment.
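The selection rule described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula for the "relative value" of the maximum, so the score below (gap between the best and second-best short-term Q-values, divided by the spread of those values) is an assumption made purely for illustration, and the names `short_term_probability` and `select_action` are hypothetical rather than taken from the paper.

```python
import random


def short_term_probability(q_short):
    """Hypothetical 'unambiguity' score in [0, 1] (an assumption, not the paper's formula).

    Gap between the best and second-best short-term Q-values,
    divided by the spread between the best and worst values.
    Expects at least two actions.
    """
    ordered = sorted(q_short, reverse=True)
    spread = ordered[0] - ordered[-1]
    if spread == 0.0:
        return 0.0  # all actions tie, so the maximum is fully ambiguous
    return (ordered[0] - ordered[1]) / spread


def select_action(q_short, q_long, rng=random):
    """Pick the greedy action from the short-term or the long-term estimate.

    The short-term estimate is chosen with probability equal to the
    unambiguity score of its maximum; otherwise the long-term one is used.
    """
    p_short = short_term_probability(q_short)
    q = q_short if rng.random() < p_short else q_long
    return max(range(len(q)), key=q.__getitem__)
```

With a clear short-term winner, for example `select_action([3.0, 1.0, 1.0], [0.0, 5.0, 0.0])`, the score is 1 and the short-term estimate decides; when all short-term values tie, the score is 0 and the long-term estimate decides.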
Pages: 175-190
Page count: 16