Efficient Bimanual Handover and Rearrangement via Symmetry-Aware Actor-Critic Learning

Cited by: 3
Authors
Li, Yunfei [1]
Pan, Chaoyi [2]
Xu, Huazhe [1,4,5]
Wang, Xiaolong [3]
Wu, Yi [1,5]
Affiliations
[1] Tsinghua Univ, Inst Interdisciplinary Informat Sci, Beijing, Peoples R China
[2] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
[3] Univ Calif San Diego, Dept Elect & Comp Engn, San Diego, CA USA
[4] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China
[5] Shanghai Qi Zhi Inst, Shanghai, Peoples R China
DOI
10.1109/ICRA48891.2023.10160739
CLC classification
TP [Automation technology; computer technology]
Subject classification code
0812
Abstract
Bimanual manipulation is important for building intelligent robots that unlock richer skills than single arms. We consider a multi-object bimanual rearrangement task, where a reinforcement learning (RL) agent aims to jointly control two arms to rearrange the objects as fast as possible. Solving this task efficiently is challenging for an RL agent due to the requirement of discovering precise inter-arm coordination in an exponentially large control space. We develop a symmetry-aware actor-critic framework that leverages the interchangeable roles of the two manipulators in the bimanual control setting to reduce the policy search space. To handle the compositionality over multiple objects, we augment training data with an object-centric relabeling technique. The overall approach produces an RL policy that can rearrange up to 8 objects with a success rate of over 70% in simulation. We deploy the policy to two Franka Panda arms and further show a successful demo on human-robot collaboration. Videos can be found at https://sites.google.com/view/bimanual.
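The core symmetry idea in the abstract (the two manipulators play interchangeable roles, so the policy/value function should be invariant under swapping them) can be illustrated with a minimal sketch. This is not the paper's actual implementation; the state/action layout, `arm_dim`, and the function names below are assumptions for illustration only. The sketch symmetrizes an arbitrary critic by averaging it over the two-element arm-swap group, so the result is swap-invariant by construction:

```python
import numpy as np

def swap_arms(obs, act, arm_dim=7):
    """Swap the per-arm blocks of an observation/action pair.
    Assumes (hypothetically) obs = [arm1 | arm2 | objects] and
    act = [arm1 | arm2], each arm block of length arm_dim."""
    obs_sw = np.concatenate(
        [obs[arm_dim:2 * arm_dim], obs[:arm_dim], obs[2 * arm_dim:]])
    act_sw = np.concatenate([act[arm_dim:], act[:arm_dim]])
    return obs_sw, act_sw

def symmetrized_q(q_fn, obs, act, arm_dim=7):
    """Average an arbitrary critic q_fn over the arm-swap group,
    so that Q(s, a) == Q(swap(s), swap(a)) holds by construction."""
    obs_sw, act_sw = swap_arms(obs, act, arm_dim)
    return 0.5 * (q_fn(obs, act) + q_fn(obs_sw, act_sw))
```

Because the swap is an involution, averaging the two evaluations is the simplest way to enforce the invariance; equivariant network architectures achieve the same property without doubling the forward passes.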
Pages: 3867 - 3874
Page count: 8