Integrating guidance into relational reinforcement learning

Cited by: 55
Authors
Driessens, K
Dzeroski, S
Affiliations
[1] Katholieke Univ Leuven, Dept Comp Sci, B-3001 Heverlee, Belgium
[2] Jozef Stefan Inst, Dept Intelligent Syst, SI-1000 Ljubljana, Slovenia
Keywords
reinforcement learning; relational learning; guided exploration
DOI
10.1023/B:MACH.0000039779.47329.3a
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning, and Q-learning in particular, encounters two major problems when dealing with large state spaces. First, learning the Q-function in tabular form may be infeasible because of the excessive amount of memory needed to store the table, and because the Q-function only converges after each state has been visited multiple times. Second, rewards in the state space may be so sparse that with random exploration they will only be discovered extremely slowly. The first problem is often solved by learning a generalization of the encountered examples (e.g., using a neural net or decision tree). Relational reinforcement learning (RRL) is such an approach; it makes Q-learning feasible in structural domains by incorporating a relational learner into Q-learning. The problem of sparse rewards has not been addressed for RRL. This paper presents a solution based on the use of "reasonable policies" to provide guidance. Different types of policies and different strategies to supply guidance through these policies are discussed and evaluated experimentally in several relational domains to show the merits of the approach.
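The core idea described in the abstract — mixing a hand-supplied "reasonable policy" into exploration so that sparse rewards are found faster — can be sketched in tabular form. The chain world, the `guidance_prob` parameter, and the always-move-right guidance policy below are illustrative assumptions, not details from the paper (which uses relational domains and a relational Q-function learner):

```python
import random

# Minimal sketch of guidance in Q-learning, under assumed details: a chain
# world whose only reward sits at the rightmost state. With pure random
# exploration that reward is discovered slowly; drawing some actions from a
# hand-coded "reasonable policy" (always step right) speeds up discovery.

N_STATES = 10          # states 0..9; reward only on reaching state 9
ACTIONS = [-1, +1]     # step left / step right
GOAL = N_STATES - 1

def reasonable_policy(state):
    """Hypothetical guidance policy: always step toward the goal."""
    return +1

def q_learning(episodes, guidance_prob, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            if rng.random() < guidance_prob:
                a = reasonable_policy(s)                    # follow guidance
            elif rng.random() < epsilon:
                a = rng.choice(ACTIONS)                     # random exploration
            else:
                a = max(ACTIONS, key=lambda b: q[(s, b)])   # greedy on Q
            s2 = min(max(s + a, 0), GOAL)                   # clamp to the chain
            r = 1.0 if s2 == GOAL else 0.0                  # sparse reward
            best_next = 0.0 if s2 == GOAL else max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
    return q

# After guided training, the greedy policy should point right in every state.
q = q_learning(episodes=200, guidance_prob=0.5)
policy = [max(ACTIONS, key=lambda b: q[(s, b)]) for s in range(GOAL)]
```

Setting `guidance_prob=0` recovers plain ε-greedy Q-learning, which makes the contrast the abstract draws (random exploration vs. guided exploration) easy to observe on longer chains.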
Pages: 271-304 (34 pages)
Related Papers
50 items total
  • [31] Algebraic Reinforcement Learning Hypothesis Induction for Relational Reinforcement Learning Using Term Generalization
    Neubert, Stefanie
    Belzner, Lenz
    Wirsing, Martin
    LOGIC, REWRITING, AND CONCURRENCY, 2015, 9200 : 562 - 579
  • [32] Integrating Reinforcement Learning into a Programming Language
    Simpkins, Christopher L.
    PROCEEDINGS OF THE TWENTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-10), 2010, : 1996 - 1997
  • [33] Hierarchical reinforcement learning guidance with threat avoidance
    Li Bohao
    Wu Yunjie
    Li Guofei
    JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS, 2022, 33 (05) : 1173 - 1185
  • [34] Bitrate Adaptation and Guidance With Meta Reinforcement Learning
    Bentaleb, Abdelhak
    Lim, May
    Akcay, Mehmet N.
    Begen, Ali C.
    Zimmermann, Roger
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (11) : 10378 - 10392
  • [36] Probabilistic Counterexample Guidance for Safer Reinforcement Learning
    Ji, Xiaotong
    Filieri, Antonio
    QUANTITATIVE EVALUATION OF SYSTEMS, QEST 2023, 2023, 14287 : 311 - 328
  • [37] Semantic Guidance of Dialogue Generation with Reinforcement Learning
    Hsueh, Cheng-Hsun
    Ma, Wei-Yun
    SIGDIAL 2020: 21ST ANNUAL MEETING OF THE SPECIAL INTEREST GROUP ON DISCOURSE AND DIALOGUE (SIGDIAL 2020), 2020, : 1 - 9
  • [38] Relational Reinforcement Learning: A Logic Programming based approach
    Preda, Mircea Cezar
    ANNALS OF THE UNIVERSITY OF CRAIOVA-MATHEMATICS AND COMPUTER SCIENCE SERIES, 2007, 34 : 124 - 132
  • [39] Heterogeneous relational reasoning in knowledge graphs with reinforcement learning
    Saebi, Mandana
    Kreig, Steven
    Zhang, Chuxu
    Jiang, Meng
    Kajdanowicz, Tomasz
    Chawla, Nitesh V.
    INFORMATION FUSION, 2022, 88 : 12 - 21
  • [40] Visual Navigation via Reinforcement Learning and Relational Reasoning
    Zhou, Kang
    Guo, Chi
    Zhang, Huyin
    2021 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, INTERNET OF PEOPLE, AND SMART CITY INNOVATIONS (SMARTWORLD/SCALCOM/UIC/ATC/IOP/SCI 2021), 2021, : 131 - 138