Enhancing object pose estimation for RGB images in cluttered scenes

Cited by: 0
Authors
Al-Selwi, Metwalli [1, 2, 3, 4]
Ning, Huang [3]
Gao, Yin [1, 3, 4]
Chao, Yan [3]
Li, Qiming [3]
Li, Jun [1, 2, 3, 4]
Affiliations
[1] Chinese Acad Sci, Fujian Inst Res Struct Matter, Fuzhou, Fujian, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Chinese Acad Sci, Quanzhou Inst Equipment Mfg, Haixi Inst, Quanzhou, Fujian, Peoples R China
[4] Univ Chinese Acad Sci, Fujian Coll, Fuzhou, Fujian, Peoples R China
Source
SCIENTIFIC REPORTS | 2025 / Volume 15 / Issue 01
Funding
National Natural Science Foundation of China;
Keywords
6D object pose estimation; Heavy occlusion; Cluttered scenes; Convolutional neural networks; Self-attention mechanisms;
DOI
10.1038/s41598-025-90482-6
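Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code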
07 ; 0710 ; 09 ;
Abstract
Estimating the 6D pose of objects is crucial for robots that interact with their environment. Estimating 6D object pose from RGB images in cluttered scenes with heavy occlusion remains a critical problem. Most existing methods estimate object pose in two stages: they first extract object features and then apply a PnP/RANSAC method to compute the pose. However, most of these techniques merely localize a set of keypoints by regressing their coordinates, which is vulnerable to occlusion and performs poorly for multi-object pose estimation. These methods also cannot regress the 6D pose directly from a loss during training. In this paper, we propose a framework based on a convolutional neural network (CNN) and a self-attention mechanism as an end-to-end method for single- and multi-object 6D pose estimation from RGB images at low computational cost. Our method uses feature fusion to extract local features and combines multi-head self-attention (MHSA) with iterative refinement to improve pose estimation performance. Furthermore, our method can be scaled according to the available computational resources. Experiments on the Linemod and Occlusion Linemod benchmark datasets show that our method achieves 97.45% and 84.84%, respectively, in terms of the ADD(-S) metric.
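For readers unfamiliar with the ADD(-S) metric cited in the abstract, the following is a minimal sketch of how it is commonly computed, not the authors' evaluation code. ADD averages the distance between corresponding model points under the ground-truth and predicted poses; ADD-S, used for symmetric objects, averages the distance from each ground-truth point to its nearest predicted point. The function names, the toy cube model, and the 10%-of-diameter acceptance threshold (a widespread convention on Linemod) are assumptions for illustration; the abstract does not state the threshold used.

```python
import numpy as np

def transform(points, R, t):
    """Apply a rigid transform (R: 3x3 rotation, t: 3-vector) to Nx3 points."""
    return points @ R.T + t

def add_error(points, R_gt, t_gt, R_pred, t_pred):
    """ADD: mean distance between corresponding transformed model points."""
    diff = transform(points, R_gt, t_gt) - transform(points, R_pred, t_pred)
    return float(np.linalg.norm(diff, axis=1).mean())

def add_s_error(points, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: mean distance from each ground-truth point to the closest predicted point."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    # Pairwise distance matrix (N x N); fine for small models, use a KD-tree for large ones.
    dists = np.linalg.norm(gt[:, None, :] - pred[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())

def is_correct(error, diameter, threshold=0.1):
    """Assumed convention: a pose counts as correct if error < 10% of the model diameter."""
    return error < threshold * diameter

# Hypothetical toy model: the 8 corners of a cube centered at the origin.
cube = np.array([[x, y, z] for x in (-1.0, 1.0) for y in (-1.0, 1.0) for z in (-1.0, 1.0)])
diameter = float(np.linalg.norm(cube.max(axis=0) - cube.min(axis=0)))

identity = np.eye(3)
zero = np.zeros(3)
# A 180-degree rotation about z maps the cube onto itself (a symmetry of the model).
rot_z_180 = np.diag([-1.0, -1.0, 1.0])

add_value = add_error(cube, identity, zero, rot_z_180, zero)
add_s_value = add_s_error(cube, identity, zero, rot_z_180, zero)
print("ADD:", add_value, "correct:", is_correct(add_value, diameter))
print("ADD-S:", add_s_value, "correct:", is_correct(add_s_value, diameter))
```

In this toy case the predicted pose differs from the ground truth only by a symmetry of the object, so ADD penalizes it heavily while ADD-S does not. That is why symmetric objects are typically scored with ADD-S and the combined score is written ADD(-S).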
Pages: 15
Related papers
50 in total
  • [1] 6D Object Pose Estimation in Cluttered Scenes from RGB Images
    Xiao-Long Yang
    Xiao-Hong Jia
    Yuan Liang
    Lu-Bin Fan
    Journal of Computer Science and Technology, 2022, 37 : 719 - 730
  • [2] 6D Object Pose Estimation in Cluttered Scenes from RGB Images
    Yang, Xiao-Long
    Jia, Xiao-Hong
    Liang, Yuan
    Fan, Lu-Bin
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2022, 37 (03) : 719 - 730
  • [3] Enhancing object pose estimation for RGB images in cluttered scenes
    Metwalli Al-Selwi
    Huang Ning
    Yin Gao
    Yan Chao
    Qiming Li
    Jun Li
    Scientific Reports, 15 (1)
  • [4] Graspability-Aware Object Pose Estimation in Cluttered Scenes
    Hoang, Dinh-Cuong
    Nguyen, Anh-Nhat
    Vu, Van-Duc
    Nguyen, Thu-Uyen
    Vu, Duy-Quang
    Ngo, Phuc-Quan
    Hoang, Ngoc-Anh
    Phan, Khanh-Toan
    Tran, Duc-Thanh
    Nguyen, Van-Thiep
    Duong, Quang-Tri
    Ho, Ngoc-Trung
    Tran, Cong-Trinh
    Duong, Van-Hiep
    Mai, Anh-Truong
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (04) : 3124 - 3130
  • [5] Hierarchical Semantic Parsing for Object Pose Estimation in Densely Cluttered Scenes
    Li, Chi
    Bohren, Jonathan
    Carlson, Eric
    Hager, Gregory D.
    2016 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2016, : 5068 - 5075
  • [6] Accurate 6D Object Pose Estimation and Refinement in Cluttered Scenes
    Jin, Yixiang
    Rossiter, John Anthony
    Veres, Sandor M.
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON ROBOTICS, COMPUTER VISION AND INTELLIGENT SYSTEMS (ROBOVIS), 2021, : 31 - 39
  • [7] PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes
    Xiang, Yu
    Schmidt, Tanner
    Narayanan, Venkatraman
    Fox, Dieter
    ROBOTICS: SCIENCE AND SYSTEMS XIV, 2018
  • [8] Robust 6D Object Pose Estimation in Cluttered Scenes using Semantic Segmentation and Pose Regression Networks
    Periyasamy, Arul Selvam
    Schwarz, Max
    Behnke, Sven
    2018 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2018, : 6660 - 6666
  • [9] Learning latent geometric consistency for 6D object pose estimation in heavily cluttered scenes
    Li, Qingnan
    Hu, Ruimin
    Xiao, Jing
    Wang, Zhongyuan
    Chen, Yu
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2020, 70
  • [10] Perception Subsystem for Object Recognition and Pose Estimation in RGB-D Images
    Kornuta, Tomasz
    Laszkowski, Michal
    CHALLENGES IN AUTOMATION, ROBOTICS AND MEASUREMENT TECHNIQUES, 2016, 440 : 597 - 607