Improving 3D Human Pose Estimation via 3D Part Affinity Fields

Cited by: 9
|
Authors
Liu, Ding [1 ]
Zhao, Zixu [1 ]
Wang, Xinchao [2 ]
Hu, Yuxiao [3 ]
Zhang, Lei [4 ]
Huang, Thomas S. [1 ]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] Stevens Inst Technol, Hoboken, NJ 07030 USA
[3] Huawei Technol Inc USA, Santa Clara, CA USA
[4] Microsoft, Bellevue, WA USA
Source
2019 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV) | 2019
Keywords
REPRESENTATION;
DOI
10.1109/WACV.2019.00112
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Subject Classification Codes
0808 ; 0809 ;
Abstract
3D human pose estimation from monocular images has recently become an active area of computer vision research. For years, most deep-neural-network-based methods have adopted either an end-to-end approach or a two-stage approach. An end-to-end network typically estimates 3D human poses directly from 2D input images, but it suffers from the scarcity of 3D human pose data, and it is unclear whether its inaccuracies stem from limited visual understanding or from the 2D-to-3D mapping. A two-stage approach, in contrast, first applies an existing network to detect 2D keypoints and then lifts those keypoints directly to 3D space; however, such methods tend to ignore useful contextual cues in the raw 2D image pixels. In this paper, we introduce a two-stage architecture that eliminates the main disadvantages of both approaches. In the first stage, we use an existing state-of-the-art detector to estimate 2D poses. To provide additional contextual information for lifting 2D poses to 3D, we propose 3D Part Affinity Fields (3D-PAFs). We use 3D-PAFs to infer 3D limb vectors and combine them with 2D poses to regress the 3D coordinates. We trained and tested the proposed framework on Human3.6M, the most popular 3D human pose benchmark dataset. Our approach achieves state-of-the-art performance, demonstrating that with the right choice of contextual information, a simple regression model can be very powerful for estimating 3D poses.
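The abstract describes the second stage as a simple regression model that takes 2D keypoints from an off-the-shelf detector together with 3D limb vectors inferred from 3D-PAFs. The sketch below illustrates that combination only; it is not the paper's implementation. The joint and limb counts, the Pose3DRegressor class, and the MLP layout are all assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical counts following the Human3.6M skeleton; assumptions,
# not taken from the paper.
NUM_JOINTS = 17
NUM_LIMBS = 16

class Pose3DRegressor(nn.Module):
    """Sketch of a second-stage regressor: concatenate 2D joints with
    3D limb direction vectors and regress 3D joint coordinates."""

    def __init__(self, hidden: int = 1024):
        super().__init__()
        in_dim = NUM_JOINTS * 2 + NUM_LIMBS * 3  # 2D joints + 3D limb vectors
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, NUM_JOINTS * 3),  # 3D joint coordinates
        )

    def forward(self, joints_2d: torch.Tensor, limb_vecs_3d: torch.Tensor) -> torch.Tensor:
        # joints_2d: (B, 17, 2), from a pretrained 2D detector (stage one)
        # limb_vecs_3d: (B, 16, 3), limb directions inferred from 3D-PAFs
        x = torch.cat([joints_2d.flatten(1), limb_vecs_3d.flatten(1)], dim=1)
        return self.net(x).view(-1, NUM_JOINTS, 3)

# Usage with random inputs:
model = Pose3DRegressor()
pose3d = model(torch.randn(4, NUM_JOINTS, 2), torch.randn(4, NUM_LIMBS, 3))
print(pose3d.shape)  # torch.Size([4, 17, 3])
```

Concatenating per-limb direction vectors with the 2D joints is one way to give a plain regression model the contextual cues that, as the abstract argues, a pure 2D-to-3D lift would otherwise miss.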
Pages: 1004-1013
Number of pages: 10