RGBManip: Monocular Image-based Robotic Manipulation through Active Object Pose Estimation

Citations: 0

Authors:

An, Boshi [1,2]

Geng, Yiran [1,2]

Chen, Kai [4]

Li, Xiaoqi [1,2,3]

Dou, Qi [4]

Dong, Hao [1,2]

Affiliations:

[1] Peking Univ, Sch CS, Hyperplane Lab, Beijing, Peoples R China

[2] Natl Key Lab Multimedia Informat Proc, Beijing, Peoples R China

[3] Beijing Acad Artificial Intelligence BAAI, Beijing, Peoples R China

[4] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China

Source:

2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024 | 2024

Funding:

National Natural Science Foundation of China

Keywords:

DOI:

10.1109/ICRA57147.2024.10610690

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Robotic manipulation requires accurate perception of the environment, which poses a significant challenge due to its inherent complexity and constantly changing nature. In this context, RGB image and point-cloud observations are two commonly used modalities in visual-based robotic manipulation, but each of these modalities has its own limitations. Commercial point-cloud observations often suffer from issues like sparse sampling and noisy output due to the limits of the emission-reception imaging principle. On the other hand, RGB images, while rich in texture information, lack essential depth and 3D information crucial for robotic manipulation. To mitigate these challenges, we propose an image-only robotic manipulation framework that leverages an eye-on-hand monocular camera installed on the robot's parallel gripper. By moving with the robot gripper, this camera gains the ability to actively perceive the object from multiple perspectives during the manipulation process. This enables the estimation of 6D object poses, which can be utilized for manipulation. While obtaining images from more diverse viewpoints typically improves pose estimation, it also increases the manipulation time. To address this trade-off, we employ a reinforcement learning policy to synchronize the manipulation strategy with active perception, achieving a balance between 6D pose accuracy and manipulation efficiency. Our experimental results in both simulated and real-world environments showcase the state-of-the-art effectiveness of our approach. We believe that our method will inspire further research on real-world-oriented robotic manipulation. See https://rgbmanip.github.io/ for more details.


Pages: 7748-7755

Page count: 8
