Efficient Q-learning hyperparameter tuning using FOX optimization algorithm

Cited by: 0
Authors:
Jumaah, Mahmood A. [1 ]
Ali, Yossra H. [1 ]
Rashid, Tarik A. [2 ]
Affiliations:
[1] Univ Technol Iraq, Dept Comp Sci, Al Sinaa St, Baghdad 10066, Iraq
[2] Univ Kurdistan Hewler, Dept Comp Sci & Engn, 30 Meter Ave, Erbil 44001, Iraq
Keywords:
FOX optimization algorithm; Hyperparameter; Optimization; Q-learning; Reinforcement learning;
DOI:
10.1016/j.rineng.2025.104341
Chinese Library Classification:
T [Industrial Technology];
Subject Classification Code:
08;
Abstract:
Reinforcement learning is a branch of artificial intelligence in which agents learn optimal actions through interaction with their environment. Hyperparameter tuning is crucial for optimizing reinforcement learning algorithms, as the selected parameters can significantly affect learning performance and reward. Conventional Q-learning relies on fixed hyperparameters that are never tuned during the learning process, making outcomes sensitive to their initial values and hindering optimal performance. In this paper, a new adaptive hyperparameter tuning method, called Q-learning-FOX (Q-FOX), is proposed. It uses the FOX optimizer, an optimization algorithm inspired by the hunting behaviour of red foxes, to adaptively optimize the learning rate (alpha) and discount factor (gamma) in Q-learning. Furthermore, a novel objective function that maximizes the average Q-value is proposed; FOX uses this function to select the solutions with maximum fitness, thereby enhancing the optimization process. The effectiveness of the proposed method is demonstrated on two OpenAI Gym control tasks, Cart Pole and Frozen Lake, where it achieved higher cumulative reward than established optimization algorithms as well as the fixed and random hyperparameter settings that represent conventional Q-learning. Across 30 independent runs, Q-FOX consistently achieved an average cumulative reward of 500 (the maximum possible) on the Cart Pole task and 0.7389 on the Frozen Lake task, a 23.37% higher average cumulative reward than conventional Q-learning tuned with established optimization algorithms on both control tasks. Ultimately, the study demonstrates that Q-FOX tunes hyperparameters adaptively in Q-learning more effectively than established methods.
Pages: 14
Related Papers (50 in total):
  • [1] Hyperparameter optimization of neural networks based on Q-learning
    Qi, Xin
    Xu, Bing
    SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (04) : 1669 - 1676
  • [2] Hyperparameter Optimization for Tracking with Continuous Deep Q-Learning
    Dong, Xingping
    Shen, Jianbing
    Wang, Wenguan
    Liu, Yu
    Shao, Ling
    Porikli, Fatih
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 518 - 527
  • [3] Efficient Deep Learning Hyperparameter Tuning Using Cloud Infrastructure: Intelligent Distributed Hyperparameter Tuning with Bayesian Optimization in the Cloud
    Ranjit, Mercy Prasanna
    Ganapathy, Gopinath
    Sridhar, Kalaivani
    Arumugham, Vikram
    2019 IEEE 12TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (IEEE CLOUD 2019), 2019, : 520 - 522
  • [4] Energy Optimization of a Base Station using Q-learning Algorithm
    Aggarwal, Anisha
    Selvamuthu, Dharmaraja
    2023 17TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS, CONTEL, 2023
  • [5] An Efficient Multimodal Emotion Identification Using FOX Optimized Double Deep Q-Learning
    Selvi, R.
    Vijayakumaran, C.
    WIRELESS PERSONAL COMMUNICATIONS, 2023, 132 (04) : 2387 - 2406
  • [6] Hyperparameter Optimization for the LSTM Method of AUV Model Identification Based on Q-Learning
    Wang, Dianrui
    Wan, Junhe
    Shen, Yue
    Qin, Ping
    He, Bo
    JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2022, 10 (08)
  • [7] An Efficient Hardware Implementation of Reinforcement Learning: The Q-Learning Algorithm
    Spano, Sergio
    Cardarilli, Gian Carlo
    Di Nunzio, Luca
    Fazzolari, Rocco
    Giardino, Daniele
    Matta, Marco
    Nannarelli, Alberto
    Re, Marco
    IEEE ACCESS, 2019, 7 : 186340 - 186351
  • [8] Backward Q-learning: The combination of Sarsa algorithm and Q-learning
    Wang, Yin-Hao
    Li, Tzuu-Hseng S.
    Lin, Chih-Jui
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (09) : 2184 - 2193