Bi-Pose: Bidirectional 2D-3D Transformation for Human Pose Estimation From a Monocular Camera

Cited by: 2
Authors
Du, Songlin [1, 2]
Wang, Hao [3]
Yuan, Zhiwei [1, 2]
Ikenaga, Takeshi [3]
Affiliations
[1] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China
[2] Southeast Univ, Shenzhen Res Inst, Shenzhen 518063, Peoples R China
[3] Waseda Univ, Grad Sch Informat Prod & Syst, Kitakyushu 8080135, Japan
Funding
National Natural Science Foundation of China
Keywords
3D human pose estimation; human-centered automation systems; bidirectional 2D-3D transformation; image-assisted 3D offset prediction; bone-length stability; ALGORITHM; TRACKING; NETWORK;
DOI
10.1109/TASE.2023.3279928
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Automatically estimating 3D human poses in video and inferring their meanings plays an essential role in many human-centered automation systems. Existing research has made remarkable progress by first estimating 2D human joints in video and then reconstructing the 3D human pose from the 2D joints. However, reconstructing the 3D pose from 2D joints in one direction only ignores the interaction between information in 3D space and 2D space and loses rich information from the original video, which limits the achievable estimation accuracy. To this end, this paper proposes a bidirectional 2D-3D transformation framework that exchanges 2D and 3D information in both directions and uses video information to estimate an offset for refining the 3D human pose. In addition, a bone-length stability loss exploits human body structure to make the estimated 3D pose more natural and to further increase overall accuracy. In evaluation, the estimation error of the proposed method, measured by the mean per joint position error (MPJPE), is only 46.5 mm, which is much lower than that of state-of-the-art methods under the same experimental conditions. The improvement in accuracy will help machines better understand human poses, supporting superior human-centered automation systems.
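The abstract reports accuracy as MPJPE and mentions a bone-length stability loss without giving its formula. The sketch below shows the standard MPJPE definition (mean Euclidean distance between predicted and ground-truth joints) in plain Python. The second function, `bone_length_variance`, is a hypothetical stand-in for a bone-length stability term: it penalizes the variance of each bone's length across frames, and is an assumption for illustration, not the loss defined in the paper. The joint layout, the `bones` index pairs, and all function names are likewise assumed.

```python
import math


def mpjpe(pred, gt):
    """Mean per joint position error.

    pred, gt: sequences of (x, y, z) joint coordinates in the same units
    (e.g. mm). Returns the average Euclidean distance over joints.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def bone_lengths(pose, bones):
    """Length of each bone; a bone is a (joint_a, joint_b) index pair."""
    return [math.dist(pose[a], pose[b]) for a, b in bones]


def bone_length_variance(poses, bones):
    """Hypothetical stability penalty (an assumption, not the paper's loss):
    the variance of each bone's length across frames, averaged over bones.
    """
    per_frame = [bone_lengths(pose, bones) for pose in poses]
    n_frames = len(per_frame)
    total = 0.0
    for k in range(len(bones)):
        values = [frame[k] for frame in per_frame]
        mean = sum(values) / n_frames
        total += sum((v - mean) ** 2 for v in values) / n_frames
    return total / len(bones)
```

With one joint predicted exactly and another off by 5 units, `mpjpe` returns the mean of 0 and 5. A sequence in which every bone keeps the same length in all frames yields a `bone_length_variance` of zero.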
Pages: 3483-3496
Number of pages: 14