DiffuPose: Monocular 3D Human Pose Estimation via Denoising Diffusion Probabilistic Model

Cited by: 7
Authors
Choi, Jeongjun [1, 2]
Shim, Dongseok [1]
Kim, H. Jin [1, 2]
Affiliations
[1] Seoul Natl Univ, Artificial Intelligence Inst AIIS, Seoul, South Korea
[2] Automat & Syst Res Inst ASRI, Seoul, South Korea
Funding
National Research Foundation, Singapore
DOI
10.1109/IROS55552.2023.10342204
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Thanks to the development of 2D keypoint detectors, monocular 3D human pose estimation (HPE) via 2D-to-3D uplifting has improved remarkably. Still, monocular 3D HPE remains challenging because of inherent depth ambiguities and occlusions. Many previous works exploit temporal information to mitigate these difficulties. However, in many real-world applications frame sequences are not accessible. This paper focuses on reconstructing a 3D pose from a single 2D keypoint detection. Rather than exploiting temporal information, we alleviate the depth ambiguity by generating multiple 3D pose candidates that map to the same 2D keypoints. We build a novel diffusion-based framework to effectively sample diverse 3D poses from the output of an off-the-shelf 2D detector. By replacing the conventional denoising U-Net with a graph convolutional network that accounts for the correlation between human joints, our approach achieves further performance improvements. We evaluate our method on the widely adopted Human3.6M and HumanEva-I datasets. Comprehensive experiments demonstrate the efficacy of the proposed method and confirm that our model outperforms state-of-the-art multi-hypothesis 3D HPE methods.
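The abstract gives no implementation details, so the following is only a minimal sketch of how multiple 3D pose hypotheses could be drawn for one set of 2D keypoints with standard DDPM ancestral sampling. The names `linear_beta_schedule`, `toy_denoiser`, and `sample_pose_hypotheses`, the schedule values, the step count, and the 17-joint skeleton are all assumptions made for illustration. In particular, `toy_denoiser` is an untrained placeholder standing in for the graph convolutional noise predictor the abstract mentions; it is not the authors' network.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear noise schedule (values are assumptions, not from the paper)."""
    return np.linspace(beta_start, beta_end, num_steps)

def toy_denoiser(x_t, t, keypoints_2d):
    """Hypothetical stand-in for a trained noise predictor eps(x_t, t, condition).

    It treats the x/y offset from the 2D keypoints and the whole depth
    coordinate as noise. A real model would be learned from data.
    """
    eps = np.zeros_like(x_t)
    eps[..., :2] = x_t[..., :2] - keypoints_2d
    eps[..., 2] = x_t[..., 2]
    return eps

def sample_pose_hypotheses(keypoints_2d, num_hypotheses, num_steps=50,
                           denoiser=toy_denoiser, seed=0):
    """Draw several 3D pose candidates conditioned on one 2D detection."""
    rng = np.random.default_rng(seed)
    betas = linear_beta_schedule(num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    num_joints = keypoints_2d.shape[0]
    # Every hypothesis starts from independent Gaussian noise.
    x = rng.standard_normal((num_hypotheses, num_joints, 3))

    for t in range(num_steps - 1, -1, -1):
        eps = denoiser(x, t, keypoints_2d)
        # Posterior mean of the standard DDPM reverse step.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            # Fresh noise at every step except the last keeps hypotheses diverse.
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
        else:
            x = mean
    return x

if __name__ == "__main__":
    keypoints = np.zeros((17, 2))  # placeholder 2D detection with 17 joints
    poses = sample_pose_hypotheses(keypoints, num_hypotheses=5)
    print(poses.shape)
```

Because each hypothesis starts from a different noise sample and receives different noise at every reverse step, the returned candidates differ from one another while sharing the same 2D condition, which is the multi-hypothesis behavior the abstract describes.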
Pages: 3773-3780
Page count: 8