Language-Guided Category Push-Grasp Synergy Learning in Clutter by Efficiently Perceiving Object Manipulation Space

Cited by: 0
Authors
Zhao, Min [1 ,2 ]
Zuo, Guoyu [1 ,2 ]
Yu, Shuangyue [1 ,2 ]
Luo, Yongkang [3 ]
Liu, Chunfang [1 ,2 ]
Gong, Daoxiong [1 ,2 ]
Affiliations
[1] Beijing Univ Sci & Technol, Sch Informat Engn, Beijing, Peoples R China
[2] Beijing Key Lab Comp Intelligence & Intelligent Sy, Beijing 100124, Peoples R China
[3] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Robots; Grasping; Semantic segmentation; Cognition; Image color analysis; Annotations; Accuracy; Feature extraction; Collision avoidance; Training; Category push-grasp synergy; cluttered scene; language-guided; object manipulation space;
DOI
10.1109/TII.2024.3488774
CLC classification number
TP [Automation Technology; Computer Technology]
Discipline code
0812
Abstract
In flexible manufacturing, robots need to swiftly adapt to constantly changing production tasks. However, it remains a challenging problem for robots to grasp objects of specific categories through language instructions to complete production tasks in cluttered scenes. To address this issue, this article proposes a language-guided category push-grasp synergy network following a cognitive-decision framework. First, inspired by how humans can understand the world through interactions with the environment, we propose an environment state difference embodied self-supervision method that enables robots to autonomously collect embodied multimodal data and generate ground truths that eliminate annotation errors for cognition network training. Second, we develop a language-guided embodied multimodal object cognition network that fuses color and depth image information, enhancing the object cognition ability of robots in cluttered scenes and enabling dynamic semantic segmentation based on language commands. Finally, we propose an object manipulation space metric to measure the manipulable space of target objects, linking the reward function with metric changes before and after actions, thereby enhancing the system's perception of the manipulation space and improving operational performance. Experiments conducted in both simulated and real-world environments demonstrate that our proposed method outperforms existing state-of-the-art methods and can be generalized for grasping novel objects.
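The abstract's third contribution links the reward function to the change in an object manipulation space metric before and after each action. The paper's actual metric and reward are not given in this record, so the following is only an illustrative sketch under assumed definitions: a toy metric that measures the fraction of free cells around a target in a 2-D occupancy grid, with the push reward defined as the metric's increase after the action (all names here, `manipulation_space` and `push_reward`, are hypothetical).

```python
# Illustrative sketch only -- NOT the authors' implementation.
# Toy "manipulation space" metric for a target object in a 2-D
# occupancy grid (0 = free, 1 = occupied), and a reward equal to
# the metric change produced by an action.
import numpy as np


def manipulation_space(occupancy: np.ndarray, target: tuple, radius: int = 2) -> float:
    """Fraction of free cells in a square window centered on the target."""
    r0 = max(target[0] - radius, 0)
    r1 = min(target[0] + radius + 1, occupancy.shape[0])
    c0 = max(target[1] - radius, 0)
    c1 = min(target[1] + radius + 1, occupancy.shape[1])
    window = occupancy[r0:r1, c0:c1]
    return float((window == 0).sum()) / window.size


def push_reward(before: np.ndarray, after: np.ndarray, target: tuple) -> float:
    """Reward a push by how much it enlarges the target's free space."""
    return manipulation_space(after, target) - manipulation_space(before, target)


# A push that clears an obstacle next to the target at (2, 2)
# yields a positive reward; a push that changes nothing yields zero.
grid_before = np.zeros((5, 5), dtype=int)
grid_before[2, 3] = 1                      # obstacle adjacent to the target
grid_after = np.zeros((5, 5), dtype=int)   # obstacle pushed away
print(push_reward(grid_before, grid_after, (2, 2)))  # 0.04
```

The point of tying the reward to the metric difference, as the abstract describes, is that pushes are rewarded only insofar as they make the target easier to grasp, rather than for motion itself.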
Pages: 1783-1792 (10 pages)