LAGOON: Language-Guided Motion Control

Cited by: 0
Authors
Xu, Shusheng [1, 2]
Wang, Huaijie [1, 2]
Ouyang, Yutao [2, 3]
Gao, Jiaxuan [1, 2]
Meng, Zhiyu [1, 2]
Yu, Chao [1]
Wu, Yi [1, 2]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Shanghai Qi Zhi Inst, Shanghai, Peoples R China
[3] Xiamen Univ, Xiamen, Peoples R China
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2024) | 2024
DOI
10.1109/ICRA57147.2024.10610467
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We aim to control a robot to physically behave in the real world following any high-level language command like "cartwheel" or "kick". Although human motion datasets exist, this task remains particularly challenging since generative models can produce physically unrealistic motions, which will be more severe for robots due to different body structures and physical properties. Deploying such a motion to a physical robot can cause even greater difficulties due to the sim2real gap. We develop LAnguage-Guided mOtion cONtrol (LAGOON), a multi-phase reinforcement learning (RL) method to generate physically realistic robot motions under language commands. LAGOON first leverages a pretrained model to generate a human motion from a language command. Then an RL phase trains a control policy in simulation to mimic the generated human motion. Finally, with domain randomization, our learned policy can be deployed to a quadrupedal robot, leading to a quadrupedal robot that can take diverse behaviors in the real world under natural language commands.
Pages: 9743-9750
Number of pages: 8
Related papers
50 in total
  • [22] Language-Guided Semantic Clustering for Remote Sensing Change Detection
    Hu, Shenglong
    Bian, Yiting
    Chen, Bin
    Song, Huihui
    Zhang, Kaihua
    SENSORS, 2024, 24 (24)
  • [23] Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields
    Miao, Xingyu
    Duan, Haoran
    Bai, Yang
    Shah, Tejal
    Song, Jun
    Long, Yang
    Ranjan, Rajiv
    Shao, Ling
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (05) : 3922 - 3934
  • [24] Language-Guided Visual Aggregation Network for Video Question Answering
    Liang, Xiao
    Wang, Di
    Wang, Quan
    Wan, Bo
    An, Lingling
    He, Lihuo
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 5195 - 5203
  • [25] Language-Guided Music Recommendation for Video via Prompt Analogies
    McKee, Daniel
    Salamon, Justin
    Sivic, Josef
    Russell, Bryan
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 14784 - 14793
  • [26] Towards a Watson That Sees: Language-Guided Action Recognition for Robots
    Teo, Ching L.
    Yang, Yezhou
    Daume, Hal, III
    Fermueller, Cornelia
    Aloimonos, Yiannis
    2012 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2012, : 374 - 381
  • [27] A Benchmark for UAV-View Natural Language-Guided Tracking
    Li, Hengyou
    Liu, Xinyan
    Li, Guorong
    ELECTRONICS, 2024, 13 (09)
  • [28] Language-Guided Transformer for Federated Multi-Label Classification
    Liu, I-Jieh
    Lin, Ci-Siang
    Yang, Fu-En
    Wang, Yu-Chiang Frank
    arXiv, 2023,
  • [29] LapsCore: Language-guided Person Search via Color Reasoning
    Wu, Yushuang
    Yan, Zizheng
    Han, Xiaoguang
    Li, Guanbin
    Zou, Changqing
    Cui, Shuguang
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 1604 - 1613
  • [30] Towards Language-Guided Visual Recognition via Dynamic Convolutions
    Luo, Gen
    Zhou, Yiyi
    Sun, Xiaoshuai
    Wu, Yongjian
    Gao, Yue
    Ji, Rongrong
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (01) : 1 - 19