Parallel Implementation of Reinforcement Learning Q-Learning Technique for FPGA

Cited by: 36

Authors:

Da Silva, Lucileide M. D. [1]

Torquato, Matheus F. [2]

Fernandes, Marcelo A. C. [3]

Affiliations:

[1] Fed Inst Rio Grande do Norte, Dept Comp Sci & Technol, BR-59200000 Santa Cruz, Brazil

[2] Swansea Univ, Coll Engn, Swansea SA2 8PP, W Glam, Wales

[3] Univ Fed Rio Grande do Norte, Dept Comp Engn & Automat, BR-59078970 Natal, RN, Brazil

Source:

IEEE ACCESS | 2019 / Vol. 7

Keywords:

FPGA; Q-learning; reinforcement learning; reconfigurable computing; HARDWARE; ARCHITECTURE; NETWORK;

DOI:

10.1109/ACCESS.2018.2885950

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Q-learning is an off-policy reinforcement learning technique whose main advantage is that it can obtain an optimal policy by interacting with an environment whose model is unknown. This paper proposes a parallel fixed-point Q-learning architecture implemented on a field-programmable gate array (FPGA), with a focus on optimizing the system processing time. Convergence results are presented, and the processing time and occupied area are analyzed for scenarios with different numbers of states and actions and for various fixed-point formats. The accuracy of the Q-learning response and the resolution error associated with reducing the number of bits are also studied for the hardware implementation. Implementation details of the architecture are described. The entire project was developed using the System Generator platform (Xilinx), with a Virtex-6 xc6vcx240t-1ff1156 as the target FPGA.
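As a rough illustration of the ideas the abstract names (the off-policy Q-learning update, and the resolution error introduced by a fixed-point format), the following is a minimal software sketch. It is not the paper's FPGA architecture. The four-state chain environment, the uniformly random behaviour policy, the parameter values, and the helper names `to_fixed`, `q_update`, `train`, and `max_abs_error` are all assumptions made for this example.

```python
import random


def to_fixed(x, frac_bits):
    """Round x to the nearest value on a grid with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def q_update(q, s, a, r, s_next, alpha, gamma, frac_bits=None):
    """Apply one tabular Q-learning update in place and return the new value.

    The target uses max over next-state actions regardless of the action the
    behaviour policy takes next, which is what makes the method off-policy.
    If frac_bits is given, the stored value is quantized (hypothetical format).
    """
    target = r + gamma * max(q[s_next])
    new_val = q[s][a] + alpha * (target - q[s][a])
    if frac_bits is not None:
        new_val = to_fixed(new_val, frac_bits)
    q[s][a] = new_val
    return new_val


def train(frac_bits=None, episodes=200, n_states=4,
          alpha=0.5, gamma=0.5, seed=0):
    """Learn Q-values on a toy chain: action 0 moves left, action 1 moves right.

    Reaching the last state yields reward 1 and ends the episode. Actions are
    chosen uniformly at random, so the trajectory does not depend on Q.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            a = rng.randrange(2)
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            q_update(q, s, a, r, s_next, alpha, gamma, frac_bits)
            s = s_next
    return q


def max_abs_error(q_a, q_b):
    """Largest element-wise absolute difference between two Q-tables."""
    return max(abs(x - y)
               for row_a, row_b in zip(q_a, q_b)
               for x, y in zip(row_a, row_b))


if __name__ == "__main__":
    q_float = train(frac_bits=None)
    for bits in (2, 4, 8):
        err = max_abs_error(q_float, train(frac_bits=bits))
        print(f"{bits} fractional bits: max |Q_float - Q_fixed| = {err}")
```

Because the same seed is used for every run, the floating-point and quantized tables are trained on identical trajectories, so the printed difference reflects only the quantization step. A hardware design would additionally fix the integer width and the rounding mode, which this sketch leaves to Python's built-in `round`.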


Pages: 2782-2798

Page count: 17

