A substructure transfer reinforcement learning method based on metric learning

Authors
Chai, Peihua [1, 2]
Chen, Bilian [1, 2]
Zeng, Yifeng [3 ]
Yu, Shenbao [4 ]
Affiliations
[1] Xiamen Univ, Sch Aerosp Engn, Dept Automat, Xiamen 361005, Peoples R China
[2] Xiamen Key Lab Big Data Intelligent Anal & Decis M, Xiamen 361005, Peoples R China
[3] Northumbria Univ, Dept Comp & Informat Sci, Newcastle Upon Tyne NE1 8ST, England
[4] Fujian Normal Univ, Coll Comp & Cyber Secur, Fuzhou 350108, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Transfer learning; Reinforcement learning; Distance measure; Markov decision process;
DOI
10.1016/j.neucom.2024.128071
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transfer reinforcement learning has gained significant traction in recent years as a research area that aims to improve agents' decision-making by reusing knowledge from similar tasks. The typical transfer learning pipeline identifies appropriate source domains, extracts shared knowledge structures, and transfers that shared knowledge to new tasks. However, existing transfer methods depend heavily on high task similarity and abundant source data. We therefore seek a more effective approach that exploits previous learning experience to direct an agent's exploration as it learns new tasks. Specifically, we introduce a transfer learning paradigm based on a distance measure in the Markov chain, denoted Distance Measure Substructure Transfer Reinforcement Learning (DMS-TRL). The core idea is to partition the Markov chain into the most basic small Markov units, each containing basic information about the agent's transition between two states, and then to apply a new distance measure to find the most similar structure, which is also the most suitable for transfer. Finally, we propose a policy transfer method that transfers knowledge through the Q table from the selected Markov unit to the target task. In a series of experiments on discrete Gridworld scenarios, we compare our approach with state-of-the-art learning methods. The results show that DMS-TRL identifies the optimal policy in target tasks and converges faster.
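The three steps named in the abstract (split into Markov units, match by distance, transfer through the Q table) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not define the paper's distance measure or transfer rule, so `MarkovUnit`, `split_into_units`, `unit_distance`, and `transfer_q` are hypothetical names, and the distance shown is a simple placeholder, not the DMS-TRL measure.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

State = Tuple[int, int]  # Gridworld cell as (row, col)


@dataclass(frozen=True)
class MarkovUnit:
    """One transition between two states (hypothetical representation)."""
    state: State
    action: str
    reward: float
    next_state: State


def split_into_units(trajectory) -> List[MarkovUnit]:
    """Split a trajectory of (s, a, r, s') tuples into one unit per transition."""
    return [MarkovUnit(s, a, r, s2) for (s, a, r, s2) in trajectory]


def unit_distance(u: MarkovUnit, v: MarkovUnit) -> float:
    """Placeholder distance, NOT the paper's measure:
    displacement mismatch + reward gap + action mismatch."""
    du = (u.next_state[0] - u.state[0], u.next_state[1] - u.state[1])
    dv = (v.next_state[0] - v.state[0], v.next_state[1] - v.state[1])
    move_gap = abs(du[0] - dv[0]) + abs(du[1] - dv[1])
    action_gap = 0.0 if u.action == v.action else 1.0
    return move_gap + abs(u.reward - v.reward) + action_gap


def transfer_q(source_units: List[MarkovUnit],
               source_q: Dict[Tuple[State, str], float],
               target_units: List[MarkovUnit]) -> Dict[Tuple[State, str], float]:
    """Initialise target Q entries from the closest source unit (assumed rule)."""
    target_q = {}
    for t in target_units:
        best = min(source_units, key=lambda s: unit_distance(s, t))
        target_q[(t.state, t.action)] = source_q.get((best.state, best.action), 0.0)
    return target_q


# Toy usage: two source transitions, one target transition.
source_units = split_into_units([
    ((0, 0), "right", 0.0, (0, 1)),
    ((0, 1), "down", 1.0, (1, 1)),
])
source_q = {((0, 0), "right"): 0.5, ((0, 1), "down"): 1.0}
target_units = split_into_units([((2, 2), "down", 1.0, (3, 2))])
target_q = transfer_q(source_units, source_q, target_units)
```

In this toy case the target transition matches the second source unit exactly under the placeholder distance, so its Q entry is seeded with that unit's value. A real implementation would substitute the distance measure and transfer policy defined in the paper.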
Pages: 11