Cross-domain policy adaptation with dynamics alignment

Cited by: 2
Authors
Gui, Haiyuan [1]
Pang, Shanchen [1]
Yu, Shihang [2]
Qiao, Sibo [1]
Qi, Yufeng [1]
He, Xiao [1]
Wang, Min [3]
Zhai, Xue [1]
Affiliations
[1] China Univ Petr East China, Coll Comp Sci & Technol, Qingdao, Peoples R China
[2] Tiangong Univ, Sch Mech Engn, Tianjin, Peoples R China
[3] China Univ Petr East China, Coll Control Sci & Engn, Qingdao, Peoples R China
Keywords
Reinforcement learning; Policy transfer; Cross domain; Reward function; Continuous control; REINFORCEMENT; NETWORK; MODEL
DOI
10.1016/j.neunet.2023.08.025
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The implementation of robotic reinforcement learning is hampered by problems such as an unspecified reward function and high training costs. Many previous works have used cross-domain policy transfer to obtain the policy of the problem domain. However, these approaches require paired and aligned dynamics trajectories or other interactions with the environment. We propose a cross-domain dynamics alignment framework for problem-domain policy acquisition that can transfer a policy trained in the source domain to the problem domain. Our framework aims to learn dynamics alignment across two domains that differ in the agents' physical parameters (armature, rotation range, or torso mass) or the agents' morphologies (limbs). Most importantly, we learn dynamics alignment between two domains using unpaired and unaligned dynamics trajectories. For these two scenarios, we propose a cross-physics-domain policy adaptation algorithm (CPD) and a cross-morphology-domain policy adaptation algorithm (CMD) based on our cross-domain dynamics alignment framework. To improve the performance of the source-domain policy so that a better policy can be transferred to the problem domain, we propose the Boltzmann TD3 (BTD3) algorithm. We conduct diverse experiments on agent continuous control domains to demonstrate the performance of our approaches. Experimental results show that our approaches can obtain better policies and higher rewards for the agents in the problem domains even when the dataset of the problem domain is small.
Pages: 104-117
Page count: 14
Related papers
50 in total
  • [1] Cross-Domain Relation Adaptation
    Kessler, Ido
    Lifshitz, Omri
    Benaim, Sagie
    Wolf, Lior
    ASIAN CONFERENCE ON MACHINE LEARNING, VOL 222, 2023, 222
  • [2] Graph Adaptation Network with Domain-Specific Word Alignment for Cross-Domain Relation Extraction
    Wang, Zhe
    Yan, Bo
    Wu, Chunhua
    Wu, Bin
    Wang, Xiujuan
    Zheng, Kangfeng
    SENSORS, 2020, 20 (24) : 1 - 23
  • [3] Unsupervised domain adaptation alignment method for cross-domain semantic segmentation of remote sensing images
    Shen Z.
    Ni H.
    Guan H.
    Cehui Xuebao/Acta Geodaetica et Cartographica Sinica, 2023, 52 (12) : 1 - 2
  • [4] BAPA-Net: Boundary Adaptation and Prototype Alignment for Cross-domain Semantic Segmentation
    Liu, Yahao
    Deng, Jinhong
    Gao, Xinchen
    Li, Wen
    Duan, Lixin
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 8781 - 8791
  • [5] Multi-source alignment domain adaptation with similarity measurement for cross-domain bearing fault diagnosis
    Xu, Yiyun
    Chen, Liang
    Zhang, Fusheng
    Wang, Shubei
    Shi, Juanjuan
    Shen, Changqing
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2023, 34 (05)
  • [6] Adversarial domain adaptation with classifier alignment for cross-domain intelligent fault diagnosis of multiple source domains
    Zhang, Yongchao
    Ren, Zhaohui
    Zhou, Shihua
    Yu, Tianzhuang
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2021, 32 (03)
  • [7] Graph Optimal Transport for Cross-Domain Alignment
    Chen, Liqun
    Gan, Zhe
    Cheng, Yu
    Li, Linjie
    Carin, Lawrence
    Liu, Jingjing
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [8] Graph Optimal Transport for Cross-Domain Alignment
    Chen, Liqun
    Gan, Zhe
    Cheng, Yu
    Li, Linjie
    Carin, Lawrence
    Liu, Jingjing
    25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019
  • [9] Cross-Domain Policy Adaptation via Value-Guided Data Filtering
    Xu, Kang
    Bai, Chenjia
    Ma, Xiaoteng
    Wang, Dong
    Zhao, Bin
    Wang, Zhen
    Li, Xuelong
    Li, Wei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [10] Joint Cross-Domain Preserving and Distribution Adaptation for Heterogeneous Domain Adaptation
    Lekshmi, R.
    Sanodiya, Rakesh Kumar
    Jose, Babita Roslind
    Mathew, Jimson
    2022 IEEE 19TH INDIA COUNCIL INTERNATIONAL CONFERENCE, INDICON, 2022