Parallel Implementation of Reinforcement Learning Q-Learning Technique for FPGA

被引:36
|
作者
Da Silva, Lucileide M. D. [1 ]
Torquato, Matheus F. [2 ]
Fernandes, Marcelo A. C. [3 ]
机构
[1] Fed Inst Rio Grande do Norte, Dept Comp Sci & Technol, BR-59200000 Santa Cruz, Brazil
[2] Swansea Univ, Coll Engn, Swansea SA2 8PP, W Glam, Wales
[3] Univ Fed Rio Grande do Norte, Dept Comp Engn & Automat, BR-59078970 Natal, RN, Brazil
关键词
FPGA; Q-learning; reinforcement learning; reconfigurable computing; HARDWARE; ARCHITECTURE; NETWORK;
D O I
10.1109/ACCESS.2018.2885950
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Q-learning is an off-policy reinforcement learning technique, which has the main advantage of obtaining an optimal policy interacting with an unknown model environment. This paper proposes a parallel fixed-point Q-learning algorithm architecture implemented on field programmable gate arrays (FPGA) focusing on optimizing the system processing time. The convergence results are presented, and the processing time and occupied area were analyzed for different states and actions sizes scenarios and various fixed-point formats. The studies concerning the accuracy of the Q-learning technique response and resolution error associated with a decrease in the number of bits were also carried out for hardware implementation. The architecture implementation details were featured. The entire project was developed using the system generator platform (Xilinx), with a Virtex-6 xc6vcx240t-1ff1156 as the target FPGA.
引用
收藏
页码:2782 / 2798
页数:17
相关论文
共 50 条
  • [21] Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics
    Weissenbacher, Matthias
    Sinha, Samarth
    Garg, Animesh
    Kawahara, Yoshinobu
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [22] Comparing NARS and Reinforcement Learning: An Analysis of ONA and Q-Learning Algorithms
    Beikmohammadi, Ali
    Magnusson, Sindri
    ARTIFICIAL GENERAL INTELLIGENCE, AGI 2023, 2023, 13921 : 21 - 31
  • [23] Multi-Agent Reinforcement Learning - An Exploration Using Q-Learning
    Graham, Caoimhin
    Bell, David
    Luo, Zhihui
    RESEARCH AND DEVELOPMENT IN INTELLIGENT SYSTEMS XXVI: INCORPORATING APPLICATIONS AND INNOVATIONS IN INTELLIGENT SYSTEMS XVII, 2010, : 293 - 298
  • [24] Improving the efficiency of reinforcement learning for a spacecraft powered descent with Q-learning
    Wilson, Callum
    Riccardi, Annalisa
    OPTIMIZATION AND ENGINEERING, 2023, 24 (01) : 223 - 255
  • [25] Autonomous Driving in Roundabout Maneuvers Using Reinforcement Learning with Q-Learning
    Garcia Cuenca, Laura
    Puertas, Enrique
    Fernandez Andres, Javier
    Aliane, Nourdine
    ELECTRONICS, 2019, 8 (12)
  • [26] An inverse reinforcement learning framework with the Q-learning mechanism for the metaheuristic algorithm
    Zhao, Fuqing
    Wang, Qiaoyun
    Wang, Ling
    KNOWLEDGE-BASED SYSTEMS, 2023, 265
  • [27] Improving the efficiency of reinforcement learning for a spacecraft powered descent with Q-learning
    Callum Wilson
    Annalisa Riccardi
    Optimization and Engineering, 2023, 24 : 223 - 255
  • [28] Efficient implementation of dynamic fuzzy Q-learning
    Deng, C
    Er, MJ
    ICICS-PCM 2003, VOLS 1-3, PROCEEDINGS, 2003, : 1854 - 1858
  • [29] Implementation of fuzzy Q-learning for a soccer agent
    Nakashima, T
    Udo, M
    Ishibuchi, H
    PROCEEDINGS OF THE 12TH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1 AND 2, 2003, : 533 - 536
  • [30] Reinforcement distribution in a team of cooperative Q-learning agents
    Abbasi, Zahra
    Abbasi, Mohammad Ali
    PROCEEDINGS OF NINTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING, 2008, : 154 - +