Enhancing object pose estimation for RGB images in cluttered scenes

Cited by: 0
Authors
Al-Selwi, Metwalli [1, 2, 3, 4]
Ning, Huang [3]
Gao, Yin [1, 3, 4]
Chao, Yan [3]
Li, Qiming [3]
Li, Jun [1, 2, 3, 4]
Affiliations
[1] Chinese Acad Sci, Fujian Inst Res Struct Matter, Fuzhou, Fujian, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Chinese Acad Sci, Quanzhou Inst Equipment Mfg, Haixi Inst, Quanzhou, Fujian, Peoples R China
[4] Univ Chinese Acad Sci, Fujian Coll, Fuzhou, Fujian, Peoples R China
Source
SCIENTIFIC REPORTS | 2025 / Vol. 15 / Issue 01
Funding
National Natural Science Foundation of China;
Keywords
6D object pose estimation; Heavy occlusion; Cluttered scenes; Convolutional neural networks; Self-attention mechanisms;
DOI
10.1038/s41598-025-90482-6
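Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract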
Estimating the 6D pose of objects is crucial for robots to interact with the environment. 6D object pose estimation from RGB images in cluttered scenes with heavy occlusion remains a critical problem. Most existing methods estimate object pose in two stages: first they extract object features, and then they apply a PnP/RANSAC method to estimate the pose. However, most of these techniques merely localize a group of keypoints by regressing their coordinates, which is vulnerable to occlusion and performs poorly for multi-object pose estimation. Moreover, these methods cannot directly regress the 6D pose through a loss during training. In this paper, we propose a framework based on a convolutional neural network (CNN) and a self-attention mechanism as an end-to-end method for single- and multi-object 6D pose estimation from RGB images at low computational cost. Our method uses feature fusion to extract local features and combines multi-head self-attention (MHSA) with iterative refinement to improve pose estimation performance. Furthermore, our method can be scaled according to the available computational resources. Experiments on the Linemod and Occlusion Linemod benchmark datasets show that our method achieves 97.45% and 84.84%, respectively, in terms of the ADD(-S) metric.
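The abstract reports results in terms of ADD(-S) without defining it. As a rough illustration only, the sketch below follows the commonly used definition: ADD averages the distance between corresponding model points under the ground-truth and estimated poses, and ADD-S (for symmetric objects) averages the distance to the nearest transformed point. The function names, the toy point set, and the 10%-of-diameter acceptance threshold are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def transform(points, R, t):
    """Apply a rigid transform (R: 3x3 rotation, t: 3-vector) to N x 3 points."""
    return points @ R.T + t

def add_error(points, R_gt, t_gt, R_est, t_est):
    """ADD: mean distance between corresponding model points under both poses."""
    diff = transform(points, R_gt, t_gt) - transform(points, R_est, t_est)
    return float(np.linalg.norm(diff, axis=1).mean())

def add_s_error(points, R_gt, t_gt, R_est, t_est):
    """ADD-S: mean distance from each ground-truth point to its nearest estimated point."""
    gt = transform(points, R_gt, t_gt)
    est = transform(points, R_est, t_est)
    # Dense N x N distance matrix; acceptable for small N, a KD-tree scales better.
    dists = np.linalg.norm(gt[:, None, :] - est[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())

def is_correct(error, diameter, threshold=0.1):
    """Assumed convention: a pose counts as correct if error < threshold * diameter."""
    return error < threshold * diameter

# Toy model (hypothetical): four points that are symmetric under a 180-degree turn about z.
points = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0]], dtype=float)
R_id = np.eye(3)
R_180 = np.diag([-1.0, -1.0, 1.0])  # rotation by 180 degrees about the z-axis
t = np.zeros(3)

add_val = add_error(points, R_id, t, R_180, t)     # every point moves by 2 units
adds_val = add_s_error(points, R_id, t, R_180, t)  # the point set maps onto itself
diameter = 2.0                                     # largest pairwise distance in the toy model
```

In this toy case the ADD error is large while the ADD-S error is zero, which is why symmetric objects are usually scored with ADD-S: the rotated pose is visually indistinguishable from the ground truth.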
Pages: 15
Related papers
50 in total
  • [41] Towards Categorization and Pose Estimation of Sets of Occluded Objects in Cluttered Scenes from Depth Data and Generic Object Models Using Joint Parsing
    Basevi, Hector
    Leonardis, Ales
    COMPUTER VISION - ECCV 2016 WORKSHOPS, PT III, 2016, 9915 : 665 - 681
  • [42] A ROBOT POSE ESTIMATION APPROACH BASED ON OBJECT TRACKING IN MONITORING SCENES
    Yuan, Wenbo
    Cao, Zhiqiang
    Zhang, Yujia
    Tan, Min
    INTERNATIONAL JOURNAL OF ROBOTICS & AUTOMATION, 2017, 32 (03): : 256 - 265
  • [43] A robot pose estimation approach based on object tracking in monitoring scenes
    Yuan, Wenbo
    Cao, Zhiqiang
    Zhang, Yujia
    Tan, Min
    International Journal of Robotics and Automation, 2017, 32 (03): : 256 - 265
  • [44] SyDPose: Object Detection and Pose Estimation in Cluttered Real-World Depth Images Trained using only Synthetic Data
    Thalhammer, Stefan
    Patten, Timothy
    Vincze, Markus
    2019 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2019), 2019, : 106 - 115
  • [45] Dense Color Constraints based 6D object pose estimation from RGB-D images
    Wang, Zilun
    Liu, Yi
    Xu, Chi
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 6416 - 6420
  • [46] Hybrid 6D Object Pose Estimation from the RGB Image
    Staszak, Rafal
    Belter, Dominik
    ICINCO: PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON INFORMATICS IN CONTROL, AUTOMATION AND ROBOTICS, VOL 1, 2019, : 541 - 549
  • [47] Object Finding in Cluttered Scenes Using Interactive Perception
    Novkovic, Tonci
    Pautrat, Remi
    Furrer, Fadri
    Breyer, Michel
    Siegwart, Roland
    Nieto, Juan
    2020 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2020, : 8338 - 8344
  • [48] Object Segmentation of Indoor Scenes Using Perceptual Organization on RGB-D Images
    Wang, Chaonan
    Xue, Yanbing
    Zhang, Hua
    Xu, Guangping
    Gao, Zan
    2016 8TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS & SIGNAL PROCESSING (WCSP), 2016,
  • [49] Pose estimation of textureless objects in cluttered environments
    Bratanic, Blaz
    Likar, Bostjan
    Pernus, Franjo
    Tomazevic, Dejan
    2015 14TH IAPR INTERNATIONAL CONFERENCE ON MACHINE VISION APPLICATIONS (MVA), 2015, : 134 - 137
  • [50] Object detection in cluttered infrared images
    Brunnström, K
    Schenkman, BN
    Jacobson, B
    OPTICAL ENGINEERING, 2003, 42 (02) : 388 - 399