RGBManip: Monocular Image-based Robotic Manipulation through Active Object Pose Estimation

Cited by: 0

Authors:

An, Boshi [1,2]

Geng, Yiran [1,2]

Chen, Kai [4]

Li, Xiaoqi [1,2,3]

Dou, Qi [4]

Dong, Hao [1,2]

Affiliations:

[1] Peking Univ, Sch CS, Hyperplane Lab, Beijing, Peoples R China

[2] Natl Key Lab Multimedia Informat Proc, Beijing, Peoples R China

[3] Beijing Acad Artificial Intelligence BAAI, Beijing, Peoples R China

[4] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China

Source:

2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024 | 2024

Funding:

National Natural Science Foundation of China

Keywords:

DOI:

10.1109/ICRA57147.2024.10610690

Chinese Library Classification:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Robotic manipulation requires accurate perception of the environment, which poses a significant challenge due to its inherent complexity and constantly changing nature. In this context, RGB image and point-cloud observations are two commonly used modalities in visual-based robotic manipulation, but each of these modalities has its own limitations. Commercial point-cloud observations often suffer from issues like sparse sampling and noisy output due to the limits of the emission-reception imaging principle. On the other hand, RGB images, while rich in texture information, lack the depth and 3D information crucial for robotic manipulation. To mitigate these challenges, we propose an image-only robotic manipulation framework that leverages an eye-on-hand monocular camera installed on the robot's parallel gripper. By moving with the robot gripper, this camera gains the ability to actively perceive the object from multiple perspectives during the manipulation process. This enables the estimation of 6D object poses, which can be utilized for manipulation. While obtaining images from more diverse viewpoints typically improves pose estimation, it also increases the manipulation time. To address this trade-off, we employ a reinforcement learning policy to synchronize the manipulation strategy with active perception, achieving a balance between 6D pose accuracy and manipulation efficiency. Our experimental results in both simulated and real-world environments showcase the state-of-the-art effectiveness of our approach. We believe that our method will inspire further research on real-world-oriented robotic manipulation. See https://rgbmanip.github.io/ for more details.
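
The abstract's point that a single RGB image lacks depth, while several views from a moving camera can recover 3D structure, can be illustrated with a minimal sketch. This is not the paper's pose-estimation method; it is standard linear (DLT) triangulation of one point from two views. The intrinsics `K`, the two camera poses, and the test point are all hypothetical values chosen for illustration, and the sketch assumes the camera poses are known (for example, from the robot's kinematics).

```python
import numpy as np

def projection_matrix(K, R, t):
    """Build the 3x4 matrix P = K [R | t] mapping world points to pixels."""
    return K @ np.hstack([R, t.reshape(3, 1)])

def project(P, point):
    """Project a 3D world point to pixel coordinates (u, v)."""
    x = P @ np.append(point, 1.0)
    return x[:2] / x[2]

def triangulate(projections, pixels):
    """Linear (DLT) triangulation of one 3D point from two or more views."""
    rows = []
    for P, (u, v) in zip(projections, pixels):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

# Hypothetical pinhole intrinsics (focal length 600 px, 640x480 image).
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical object point, 0.8 m in front of the first camera.
point = np.array([0.1, -0.05, 0.8])

# View 1: camera at the world origin. View 2: camera moved 0.2 m along +x,
# which gives t = -R @ C = [-0.2, 0, 0] with R = identity.
P1 = projection_matrix(K, np.eye(3), np.zeros(3))
P2 = projection_matrix(K, np.eye(3), np.array([-0.2, 0.0, 0.0]))

pixels = [project(P1, point), project(P2, point)]
estimate = triangulate([P1, P2], pixels)
print(estimate)
```

With noise-free pixels the estimate matches the original point up to numerical precision. With noisy detections, adding more views with a wider baseline generally tightens the estimate, which mirrors the accuracy-versus-time trade-off the abstract describes.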


Pages: 7748-7755

Page count: 8

