Efficient Q-learning hyperparameter tuning using FOX optimization algorithm

Cited by: 0
Authors
Jumaah, Mahmood A. [1 ]
Ali, Yossra H. [1 ]
Rashid, Tarik A. [2 ]
Affiliations
[1] Univ Technol Iraq, Dept Comp Sci, Al Sinaa St, Baghdad 10066, Iraq
[2] Univ Kurdistan Hewler, Dept Comp Sci & Engn, 30 Meter Ave, Erbil 44001, Iraq
Keywords
FOX optimization algorithm; Hyperparameter; Optimization; Q-learning; Reinforcement learning;
DOI
10.1016/j.rineng.2025.104341
Chinese Library Classification
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
Reinforcement learning is a branch of artificial intelligence in which agents learn optimal actions through interactions with their environment. Hyperparameter tuning is crucial for optimizing reinforcement learning algorithms and involves selecting parameters that can significantly affect learning performance and reward. Conventional Q-learning relies on fixed hyperparameters that are not tuned during the learning process; because outcomes are sensitive to these values, this can hinder optimal performance. In this paper, a new adaptive hyperparameter tuning method, called Q-learning-FOX (Q-FOX), is proposed. This method uses the FOX optimizer, an optimization algorithm inspired by the hunting behaviour of red foxes, to adaptively optimize the learning rate (alpha) and discount factor (gamma) in Q-learning. Furthermore, a novel objective function is proposed that maximizes the average Q-values. FOX uses this function to select the solutions with maximum fitness, thereby enhancing the optimization process. The effectiveness of the proposed method is demonstrated through evaluations on two OpenAI Gym control tasks: Cart Pole and Frozen Lake. The proposed method achieved a higher cumulative reward than established optimization algorithms, as well as fixed and random hyperparameter tuning methods, where the fixed and random methods represent conventional Q-learning. Across 30 independent runs, Q-FOX consistently achieved an average cumulative reward of 500 (the maximum possible) for the Cart Pole task and 0.7389 for the Frozen Lake task, a 23.37% higher average cumulative reward than conventional Q-learning and established optimization algorithms in both control tasks. Overall, the study demonstrates that Q-FOX is superior at adaptively tuning hyperparameters in Q-learning, outperforming established methods.
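The ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' Q-FOX implementation: the FOX optimizer is replaced here by plain random search over (alpha, gamma), the environment is a hypothetical 5-state chain rather than Cart Pole or Frozen Lake, and all function names (`q_update`, `average_q`, `train_chain`, `tune`) are assumptions made for illustration. Only the tabular Q-learning update and the "maximize the average Q-value" fitness follow the description above.

```python
import random


def q_update(q, s, a, r, s_next, alpha, gamma):
    """Tabular Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def average_q(q):
    """Fitness assumed from the abstract: the mean of all Q-table entries."""
    values = [v for row in q for v in row]
    return sum(values) / len(values)


def train_chain(alpha, gamma, episodes=200, n_states=5, seed=0):
    """Train on a hypothetical chain: action 1 moves right, action 0 moves left,
    and reaching the last state gives reward 1 and ends the episode."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(50):  # cap on steps per episode
            # Epsilon-greedy action choice (epsilon = 0.2), random on ties.
            if rng.random() < 0.2 or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            done = s_next == n_states - 1
            q_update(q, s, a, 1.0 if done else 0.0, s_next, alpha, gamma)
            s = s_next
            if done:
                break
    return q


def tune(n_candidates=10, seed=1):
    """Random search standing in for FOX: keep the (alpha, gamma) pair
    whose trained Q-table has the highest average Q-value."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_candidates):
        alpha = rng.uniform(0.05, 1.0)
        gamma = rng.uniform(0.5, 0.99)
        fitness = average_q(train_chain(alpha, gamma))
        if best is None or fitness > best[0]:
            best = (fitness, alpha, gamma)
    return best


if __name__ == "__main__":
    fitness, alpha, gamma = tune()
    print(f"best fitness={fitness:.4f} alpha={alpha:.3f} gamma={gamma:.3f}")
```

In this toy setting a larger discount factor inflates the learned Q-values, so an average-Q fitness tends to favour larger gamma; whether that carries over to the tasks in the paper is not something this sketch can show.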
Pages: 14