Efficient Q-learning hyperparameter tuning using FOX optimization algorithm

Cited by: 0
Authors
Jumaah, Mahmood A. [1]
Ali, Yossra H. [1]
Rashid, Tarik A. [2]
Affiliations
[1] Univ Technol Iraq, Dept Comp Sci, Al Sinaa St, Baghdad 10066, Iraq
[2] Univ Kurdistan Hewler, Dept Comp Sci & Engn, 30 Meter Ave, Erbil 44001, Iraq
Keywords
FOX optimization algorithm; Hyperparameter; Optimization; Q-learning; Reinforcement learning;
DOI
10.1016/j.rineng.2025.104341
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Reinforcement learning is a branch of artificial intelligence in which agents learn optimal actions through interactions with their environment. Hyperparameter tuning is crucial for optimizing reinforcement learning algorithms and involves selecting parameters that can significantly affect learning performance and reward. Conventional Q-learning relies on fixed hyperparameters that are not tuned during the learning process; because the outcomes are sensitive to these values, this can hinder optimal performance. In this paper, a new adaptive hyperparameter tuning method, called Q-learning-FOX (Q-FOX), is proposed. This method uses the FOX optimizer, an optimization algorithm inspired by the hunting behaviour of red foxes, to adaptively optimize the learning rate (alpha) and discount factor (gamma) in Q-learning. Furthermore, a novel objective function is proposed that maximizes the average Q-values. FOX uses this function to select the solutions with maximum fitness, thereby enhancing the optimization process. The effectiveness of the proposed method is demonstrated through evaluations on two OpenAI Gym control tasks: Cart Pole and Frozen Lake. The proposed method achieved a higher cumulative reward than established optimization algorithms, as well as fixed and random hyperparameter tuning methods, where the fixed and random methods represent conventional Q-learning. Q-FOX consistently achieved an average cumulative reward of 500 (the maximum possible) for the Cart Pole task and 0.7389 for the Frozen Lake task across 30 independent runs, demonstrating a 23.37% higher average cumulative reward than conventional Q-learning, which uses established optimization algorithms, in both control tasks. Ultimately, the study demonstrates that Q-FOX is superior for adaptively tuning hyperparameters in Q-learning, outperforming established methods.
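The idea in the abstract, scoring a candidate (alpha, gamma) pair by the average Q-value it produces, can be illustrated with a small sketch. This is not the authors' implementation: the toy chain environment, the function names (`q_learning`, `average_q`, `random_search`), the search ranges, and the episode counts are all assumptions made for illustration, and plain random search stands in for the FOX optimizer, whose update rules are not given in the abstract. The fitness used here is simply the mean of all Q-table entries, which may differ from the paper's exact objective function.

```python
import random


def q_learning(alpha, gamma, episodes=200, n_states=6, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical chain of n_states states.

    Actions: 0 = left, 1 = right. Reaching the last state gives
    reward 1 and ends the episode; every other step gives reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap on steps per episode
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice([a for a in range(2) if q[state][a] == best])
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0
            # Standard update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def average_q(q):
    """Assumed fitness: mean of all entries in the Q-table."""
    values = [v for row in q for v in row]
    return sum(values) / len(values)


def random_search(trials=20, seed=1):
    """Stand-in for FOX: sample (alpha, gamma) and keep the fittest pair."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        alpha = rng.uniform(0.01, 1.0)
        gamma = rng.uniform(0.0, 0.99)
        fitness = average_q(q_learning(alpha, gamma))
        if best is None or fitness > best[0]:
            best = (fitness, alpha, gamma)
    return best


if __name__ == "__main__":
    fitness, alpha, gamma = random_search()
    print(f"fitness={fitness:.4f} alpha={alpha:.3f} gamma={gamma:.3f}")
```

In this sketch the hyperparameters are chosen once before training, whereas the abstract describes adapting them during learning; replacing `random_search` with a population-based optimizer called between episodes would be closer to that description. Note also that a mean-Q fitness tends to favour larger gamma values, since a larger discount factor raises the Q-values themselves.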
Pages: 14