PoseDiffusion: A Coarse-to-Fine Framework for Unseen Object 6-DoF Pose Estimation

Cited by: 2
Authors
Zhou, Jiaming [1, 2]
Zhu, Qing [1, 2]
Wang, Yaonan [1, 2]
Feng, Mingtao [3]
Wu, Chengzhong [4]
Liu, Xuebing [1, 2]
Huang, Jianan [1, 2]
Mian, Ajmal [5]
Affiliations
[1] Hunan Univ, Coll Elect & Informat Engn, Changsha 410012, Peoples R China
[2] Natl Engn Res Ctr Robot Visual Percept & Control, Changsha 410082, Peoples R China
[3] Xidian Univ, Sch Artificial Intelligence, Xian 710071, Peoples R China
[4] Jiangxi Prov Commun Terminal Ind Co Ltd, Jian 343000, Peoples R China
[5] Univ Western Australia, Dept Comp Sci & Software Engn, Perth, WA 6009, Australia
Keywords
Diffusion model; robotic grasping; transformer; unseen object pose estimation
DOI
10.1109/TII.2024.3399886
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Accurately estimating the six-degrees-of-freedom (6-DoF) pose of unseen objects is crucial for successful robotic manipulation in industrial automation. Some existing methods for this task rely on prior knowledge of individual objects, i.e., the model must be trained on the exact object instance or object category. Others perform unseen object pose estimation but are limited in their feature learning and pose refinement ability. To address these problems, we propose an unseen object pose estimation method that follows a coarse-to-fine framework and leverages the powerful learning ability of diffusion models. We introduce a diffusion model for generating object poses, and conduct a comparison between the generated poses and the original pose to determine the optimal one. We design a novel pose estimation module to provide coarse poses for PoseDiffusion. This module comprises two feature extraction modules that extract global and masked features. In addition, we propose a strategy to estimate the pose by comparing the similarity between rendered and query poses. The renderings of an unseen object from various viewpoints are generated from its computer-aided design (CAD) model. Our method requires a CAD model of the unseen object only during inference, a scenario well suited to industrial applications. Experimental evaluation on benchmark datasets demonstrates that the proposed framework outperforms existing approaches, achieving state-of-the-art performance in 6-DoF object pose estimation.
Pages: 11127-11138
Page count: 12