NPV-DQN: Improving Value-based Reinforcement Learning, by Variable Discount Factor, with Control Applications

Cited by: 0

Authors:

Paczolay, Gabor [1]

Harmati, Istvan [1]

Affiliations:

[1] Budapest Univ Technol & Econ, Dept Control Engn, Magyar tudosok krt 2,1 bldg, H-1117 Budapest, Hungary

Source:

ACTA POLYTECHNICA HUNGARICA | 2024 / Vol. 21 / No. 11

Keywords:

reinforcement learning; DQN; NPV; NPV-DQN;

DOI:

Not available

Chinese Library Classification:

T [Industrial Technology];

Discipline classification code:

08 ;

Abstract:

The discount factor plays an important role in reinforcement learning algorithms: it determines how much future rewards are valued at the present time-step. In this paper, a system is used in which Q-values are estimated with two distinct discount factors. These estimates can later be merged into one network to make the computations more efficient. The decision of which network to use is based on the relative value of the maximum of the short-term network: the more unambiguous that maximum is, the higher the probability of selecting that network. The system is then benchmarked on a cartpole and a gridworld environment.
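The selection rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the function names `short_term_confidence` and `select_action` are hypothetical, and the "unambiguity" score (the gap between the best and second-best short-term Q-values, divided by the spread of all short-term Q-values) is only one plausible reading of "relative value of the maximum", not necessarily the authors' definition.

```python
import numpy as np


def short_term_confidence(q_short):
    """Hypothetical 'unambiguity' score in [0, 1] (an assumption, not the paper's formula).

    Gap between the best and second-best short-term Q-values,
    relative to the spread of all short-term Q-values.
    """
    q = np.asarray(q_short, dtype=float)
    if q.size < 2:
        return 1.0  # a single action is trivially unambiguous
    spread = q.max() - q.min()
    if spread == 0.0:
        return 0.0  # all actions look identical: fully ambiguous
    top_two = np.sort(q)[-2:]
    return float((top_two[1] - top_two[0]) / spread)


def select_action(q_short, q_long, rng):
    """Pick the short-term estimate with probability equal to its confidence,
    otherwise fall back to the long-term estimate; then act greedily."""
    p_short = short_term_confidence(q_short)
    use_short = bool(rng.random() < p_short)
    q = q_short if use_short else q_long
    return int(np.argmax(q)), use_short


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Q-values from two estimators trained with a small and a large discount factor
    q_short = [0.1, 0.9, 0.2]
    q_long = [4.0, 3.5, 4.2]
    print(select_action(q_short, q_long, rng))
```

In this sketch, a clear winner among the short-term Q-values makes the short-term estimate more likely to drive the action, while a near-tie hands the decision to the long-term estimate. The two Q-vectors could equally be two output heads of a single merged network.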

Pages: 175-190

Page count: 16

