Cross-domain policy adaptation with dynamics alignment

Cited by: 2
Authors
Gui, Haiyuan [1]
Pang, Shanchen [1]
Yu, Shihang [2]
Qiao, Sibo [1]
Qi, Yufeng [1]
He, Xiao [1]
Wang, Min [3]
Zhai, Xue [1]
Affiliations
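[1] China University of Petroleum (East China), College of Computer Science and Technology, Qingdao, People's Republic of China
[2] Tiangong University, School of Mechanical Engineering, Tianjin, People's Republic of China
[3] China University of Petroleum (East China), College of Control Science and Engineering, Qingdao, People's Republic of China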
Keywords
Reinforcement learning; Policy transfer; Cross domain; Reward function; Continuous control; REINFORCEMENT; NETWORK; MODEL;
DOI
10.1016/j.neunet.2023.08.025
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The implementation of robotic reinforcement learning is hampered by problems such as an unspecified reward function and high training costs. Many previous works have used cross-domain policy transfer to obtain a policy for the problem domain. However, these approaches require paired and aligned dynamics trajectories or other interactions with the environment. We propose a cross-domain dynamics alignment framework that transfers a policy trained in the source domain to the problem domain. The framework learns dynamics alignment across two domains that differ in the agents' physical parameters (armature, rotation range, or torso mass) or in the agents' morphologies (limbs). Most importantly, it learns this alignment from unpaired and unaligned dynamics trajectories. For these two scenarios, we propose a cross-physics-domain policy adaptation algorithm (CPD) and a cross-morphology-domain policy adaptation algorithm (CMD), both based on the framework. To improve the source-domain policy so that a better policy can be transferred to the problem domain, we propose the Boltzmann TD3 (BTD3) algorithm. We conduct diverse experiments on continuous control domains to evaluate our approaches. Experimental results show that they obtain better policies and higher rewards for agents in the problem domains, even when the problem-domain dataset is small.
Pages: 104-117
Number of pages: 14