Faster MIL-based Subgoal Identification for Reinforcement Learning by Tuning Fewer Hyperparameters

Cited: 0
Authors
Sunel, Saim [1 ]
Cilden, Erkin [2 ]
Polat, Faruk [1 ]
Affiliations
[1] Middle East Tech Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye
[2] STM Def Technol Engn & Trade Inc, RF & Simulat Syst Directorate, Ankara, Turkiye
Keywords
Subgoal identification; expectation-maximization; diverse density; hyper-parameter search; multiple instance learning; reinforcement learning; DISCOVERY; FRAMEWORK;
DOI
10.1145/3643852
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Various methods have been proposed in the literature for identifying subgoals in discrete reinforcement learning (RL) tasks. Once subgoals are discovered, task decomposition methods can be employed to improve the learning performance of agents. In this study, we classify prominent subgoal identification methods for discrete RL tasks in the literature into three categories: graph-based, statistics-based, and multiple-instance learning (MIL)-based. As contributions, first, we introduce a new MIL-based subgoal identification algorithm called EMDD-RL and experimentally compare it with a previous MIL-based method. The previous approach adapts MIL's Diverse Density (DD) algorithm, whereas our method builds on Expectation-Maximization Diverse Density (EMDD). The advantage of EMDD over DD is that it can yield more accurate results with a lower computational demand thanks to the expectation-maximization algorithm. EMDD-RL modifies some of the algorithmic steps of EMDD to identify subgoals in discrete RL problems. Second, we evaluate the methods in several RL tasks for the hyperparameter tuning overhead they incur. Third, we propose a new RL problem called key-room and compare the subgoal identification performance of the methods in this new task. Experiment results show that MIL-based subgoal identification methods could be preferred to the algorithms of the other two categories in practice.
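The abstract contrasts DD, which scores every candidate concept against every instance of every bag, with EMDD, whose E-step keeps only one representative instance per bag before the M-step optimization. The following is a minimal Python sketch of that idea on toy grid-world trajectories; the data, the Gaussian similarity, and all names here are illustrative assumptions, not the paper's EMDD-RL algorithm.

```python
import math

# Toy data (hypothetical, for illustration only): states are (x, y) grid
# cells. In the MIL framing, trajectories that reached the goal are positive
# bags of states; trajectories that failed are negative bags.
pos_bags = [
    [(0, 0), (1, 0), (2, 2), (3, 3)],
    [(0, 3), (1, 2), (2, 2), (3, 1)],
]
neg_bags = [
    [(0, 0), (0, 1), (0, 2)],
]

def instance_prob(state, concept, scale=1.0):
    """Gaussian-like similarity between an instance and a candidate concept."""
    d2 = sum((a - b) ** 2 for a, b in zip(state, concept))
    return math.exp(-scale * d2)

def diverse_density(concept):
    """Noisy-or DD: high when every positive bag contains an instance near
    the concept and no negative instance is near it."""
    score = 1.0
    for bag in pos_bags:
        score *= 1.0 - math.prod(1.0 - instance_prob(s, concept) for s in bag)
    for bag in neg_bags:
        score *= math.prod(1.0 - instance_prob(s, concept) for s in bag)
    return score

def emdd_score(concept, reps_pos, reps_neg):
    """M-step objective: only one representative instance per bag is scored."""
    score = math.prod(instance_prob(s, concept) for s in reps_pos)
    score *= math.prod(1.0 - instance_prob(s, concept) for s in reps_neg)
    return score

candidates = sorted({s for bag in pos_bags for s in bag})

# Plain DD: every candidate is scored against every instance of every bag.
best_dd = max(candidates, key=diverse_density)

# EMDD-style alternation: the E-step picks the most likely instance per bag,
# so each M-step evaluation touches one instance per bag instead of all.
concept = best_dd
for _ in range(10):
    reps_pos = [max(bag, key=lambda s: instance_prob(s, concept)) for bag in pos_bags]
    reps_neg = [max(bag, key=lambda s: instance_prob(s, concept)) for bag in neg_bags]
    concept = max(candidates, key=lambda c: emdd_score(c, reps_pos, reps_neg))

print(best_dd, concept)  # both settle on the shared "bottleneck" state
```

On this toy data both objectives select (2, 2), the only state shared by all successful trajectories and absent from the failed one, which is the bottleneck intuition behind subgoal identification. Note that a full EMDD implementation would optimize the concept (and per-feature scales) with gradient-based search rather than an exhaustive scan over candidates.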
Pages: 29