Efficient Q-learning hyperparameter tuning using FOX optimization algorithm

Cited by: 0
Authors
Jumaah, Mahmood A. [1 ]
Ali, Yossra H. [1 ]
Rashid, Tarik A. [2 ]
Affiliations
[1] Univ Technol Iraq, Dept Comp Sci, Al Sinaa St, Baghdad 10066, Iraq
[2] Univ Kurdistan Hewler, Dept Comp Sci & Engn, 30 Meter Ave, Erbil 44001, Iraq
Keywords
FOX optimization algorithm; Hyperparameter; Optimization; Q-learning; Reinforcement learning
DOI
10.1016/j.rineng.2025.104341
CLC number
T [Industrial Technology]
Subject classification number
08
Abstract
Reinforcement learning is a branch of artificial intelligence in which agents learn optimal actions through interactions with their environment. Hyperparameter tuning is crucial for optimizing reinforcement learning algorithms and involves selecting parameters that can significantly impact learning performance and reward. Conventional Q-learning relies on fixed hyperparameters that are not tuned throughout the learning process, a choice to which outcomes are sensitive and which can hinder optimal performance. In this paper, a new adaptive hyperparameter tuning method, called Q-learning-FOX (Q-FOX), is proposed. This method utilizes the FOX optimizer, an optimization algorithm inspired by the hunting behaviour of red foxes, to adaptively optimize the learning rate (alpha) and discount factor (gamma) in Q-learning. Furthermore, a novel objective function is proposed that maximizes the average of the Q-values; FOX utilizes this function to select the solutions with maximum fitness, thereby enhancing the optimization process. The effectiveness of the proposed method is demonstrated through evaluations on two OpenAI Gym control tasks: Cart Pole and Frozen Lake. The proposed method achieved superior cumulative reward compared to established optimization algorithms, as well as to the fixed and random hyperparameter tuning methods that represent conventional Q-learning. Q-FOX consistently achieved an average cumulative reward of 500 (the maximum possible) on the Cart Pole task and 0.7389 on the Frozen Lake task across 30 independent runs, an average cumulative reward 23.37% higher than that of Q-learning tuned with the established optimization algorithms across both control tasks. Ultimately, the study demonstrates that Q-FOX is superior at tuning hyperparameters adaptively in Q-learning, outperforming established methods.
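The record does not include the authors' implementation, but the abstract pins down the moving parts: the tabular Q-learning update Q(s, a) <- Q(s, a) + alpha * (r + gamma * max Q(s', .) - Q(s, a)), with alpha and gamma re-optimized during learning against a fitness defined as the average of the Q-values. The sketch below illustrates that loop on Frozen Lake under stated assumptions: a plain population-based random search stands in for the FOX optimizer (whose fox-hunting update rules are not given in this record), the gymnasium package supplies the environment, and the helper names (fitness, select_hyperparams), candidate ranges, update frequency, and population size are all illustrative, not from the paper.

```python
# Minimal sketch of adaptive (alpha, gamma) tuning in tabular Q-learning.
# NOT the authors' code: a random-search stand-in replaces the FOX optimizer,
# and gymnasium, the candidate ranges, and pop_size are assumptions.
import numpy as np
import gymnasium as gym  # assumed environment API (OpenAI Gym also works)

rng = np.random.default_rng(0)
env = gym.make("FrozenLake-v1")
n_states, n_actions = env.observation_space.n, env.action_space.n
Q = np.zeros((n_states, n_actions))

def fitness(q_table, transition, alpha, gamma):
    """Average Q-value after one update with (alpha, gamma): the objective
    the abstract says FOX maximizes."""
    s, a, r, s_next = transition
    q = q_table.copy()
    q[s, a] += alpha * (r + gamma * q[s_next].max() - q[s, a])
    return q.mean()

def select_hyperparams(q_table, transition, pop_size=10):
    """Stand-in for the FOX optimizer: sample candidate (alpha, gamma) pairs
    and keep the one whose update maximizes the average-Q fitness."""
    candidates = rng.uniform([0.01, 0.50], [1.0, 0.999], size=(pop_size, 2))
    scores = [fitness(q_table, transition, a, g) for a, g in candidates]
    return candidates[int(np.argmax(scores))]

epsilon = 0.1  # exploration rate; kept fixed, as only alpha and gamma are tuned
for episode in range(2000):
    state, _ = env.reset()
    done = False
    while not done:
        action = (env.action_space.sample() if rng.random() < epsilon
                  else int(Q[state].argmax()))
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # Adaptive step: re-select (alpha, gamma) before each Q-update.
        alpha, gamma = select_hyperparams(Q, (state, action, reward, next_state))
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max()
                                     - Q[state, action])
        state = next_state
```

Replacing select_hyperparams with the FOX optimizer's exploration/exploitation update over the same average-Q fitness would recover the Q-FOX structure the abstract describes.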
Pages: 14
Related Papers
50 records in total
  • [41] Hyperparameter Tuning for Deep Neural Networks Based Optimization Algorithm
    Vidyabharathi, D.
    Mohanraj, V.
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 36 (03): 2559-2573
  • [42] Proposal for improvement of GRASP metaheuristic and Genetic Algorithm using the Q-Learning Algorithm
    de Lima, Francisco Chagas, Jr.
    de Melo, Jorge D.
    Neto, Adriao Duarte D.
    PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, 2007: 465+
  • [43] Optimal Q-Learning Approach for Tuning the Cavity Filters
    Sekhri, Even
    Tamre, Mart
    Kapoor, Rajiv
    PROCEEDINGS OF THE 2019 20TH INTERNATIONAL CONFERENCE ON RESEARCH AND EDUCATION IN MECHATRONICS (REM 2019), 2019
  • [44] Online tuning of fuzzy inference systems using dynamic fuzzy Q-learning
    Er, M. J.
    Deng, C.
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2004, 34 (03): 1478-1489
  • [45] Enhancing Nash Q-learning and Team Q-learning mechanisms by using bottlenecks
    Ghazanfari, Behzad
    Mozayani, Nasser
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2014, 26 (06): 2771-2783
  • [47] Q-learning whale optimization algorithm for test suite generation with constraints support
    Hassan, Ali Abdullah
    Abdullah, Salwani
    Zamli, Kamal Z.
    Razali, Rozilawati
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (34): 24069-24090
  • [48] Incorporating Q-learning and gradient search scheme into JAYA algorithm for global optimization
    Deng, Lingyun
    Liu, Sanyang
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (SUPPL3): S3705-S3748
  • [49] A Service-Centric Q-Learning Algorithm for Mobility Robustness Optimization in LTE
    Mari-Altozano, Maria Luisa
    Mwanje, Stephen S.
    Luna Ramirez, Salvador
    Toril, Matias
    Sanneck, Henning
    Gijon, Carolina
    IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, 2021, 18 (03): 3541-3555
  • [50] Combination Optimization Model of Urban Key Intersections Based on Q-Learning Algorithm
    Dong, Dan-Ping
    Wei, Fu-Lu
    Chen, Ming-Tao
    Guo, Yong-Qing
    Yang, Chang-Hai
    Han, Yu-Xin
    CICTP 2023: INNOVATION-EMPOWERED TECHNOLOGY FOR SUSTAINABLE, INTELLIGENT, DECARBONIZED, AND CONNECTED TRANSPORTATION, 2023: 849-859