Enhancing object pose estimation for RGB images in cluttered scenes

Authors
Al-Selwi, Metwalli [1, 2, 3, 4]
Ning, Huang [3]
Gao, Yin [1, 3, 4]
Chao, Yan [3]
Li, Qiming [3]
Li, Jun [1, 2, 3, 4]
Affiliations
[1] Chinese Acad Sci, Fujian Inst Res Struct Matter, Fuzhou, Fujian, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Chinese Acad Sci, Quanzhou Inst Equipment Mfg, Haixi Inst, Quanzhou, Fujian, Peoples R China
[4] Univ Chinese Acad Sci, Fujian Coll, Fuzhou, Fujian, Peoples R China
Source
SCIENTIFIC REPORTS | 2025 / Volume 15 / Issue 01
Funding
National Natural Science Foundation of China;
Keywords
6D object pose estimation; Heavy occlusion; Cluttered scenes; Convolutional neural networks; Self-attention mechanisms;
DOI
10.1038/s41598-025-90482-6
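Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract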
Estimating the 6D pose of objects is crucial for robots to interact with their environment. Estimating 6D object pose from RGB images in cluttered scenes with heavy occlusion remains a critical challenge. Most existing methods estimate object pose in two stages: first they extract object features, and then they apply a PnP/RANSAC method to recover the pose. However, most of these techniques merely localize a set of keypoints by regressing their coordinates, which makes them vulnerable to occlusion and leads to poor performance in multi-object pose estimation. Moreover, these methods cannot directly regress the 6D pose through a loss during training. In this paper, we propose an end-to-end framework based on a convolutional neural network (CNN) and a self-attention mechanism for single- and multi-object 6D pose estimation from RGB images at low computational cost. Our method uses feature fusion to extract local features and combines multi-head self-attention (MHSA) with iterative refinement to improve pose estimation performance. Furthermore, our method can be scaled according to the available computational resources. Experiments on the Linemod and Occlusion Linemod benchmark datasets show that our method achieves 97.45% and 84.84%, respectively, in terms of the ADD(-S) metric.
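The abstract reports results in terms of ADD(-S) but does not define it. As a rough illustration, the sketch below follows the commonly used definition: ADD averages the distance between model points transformed by the ground-truth pose and by the estimated pose, while ADD-S (for symmetric objects) averages the distance from each ground-truth point to its nearest estimated point. The function names, the pure-Python data layout, and the 10% of object diameter acceptance threshold are assumptions made for this example and are not taken from the paper.

```python
import math


def transform(points, R, t):
    """Apply rotation R (3x3 nested list) and translation t (length 3) to each 3D point."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]


def add_metric(points, R_gt, t_gt, R_est, t_est):
    """ADD: mean distance between corresponding model points under the two poses."""
    gt = transform(points, R_gt, t_gt)
    est = transform(points, R_est, t_est)
    return sum(math.dist(a, b) for a, b in zip(gt, est)) / len(points)


def add_s_metric(points, R_gt, t_gt, R_est, t_est):
    """ADD-S: mean distance from each ground-truth point to its closest estimated point."""
    gt = transform(points, R_gt, t_gt)
    est = transform(points, R_est, t_est)
    return sum(min(math.dist(a, b) for b in est) for a in gt) / len(points)


def is_pose_correct(error, diameter, threshold=0.1):
    """Assumed convention: accept the pose if the error is below 10% of the object diameter."""
    return error < threshold * diameter


# Toy model: four points that are symmetric under a 180-degree rotation about z.
model = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
rot_z_180 = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]
zero = (0, 0, 0)

add_error = add_metric(model, identity, zero, rot_z_180, zero)
add_s_error = add_s_metric(model, identity, zero, rot_z_180, zero)
```

In this toy case the estimated pose differs from the ground truth only by a rotation that maps the point set onto itself, so ADD penalizes it heavily while ADD-S does not. That is why the symmetric variant is typically used for symmetric objects, and the accuracy figures quoted above would be the fraction of test poses accepted under such a threshold.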
Pages: 15