Bi-Pose: Bidirectional 2D-3D Transformation for Human Pose Estimation From a Monocular Camera

Cited by: 2
Authors
Du, Songlin [1, 2]
Wang, Hao [3]
Yuan, Zhiwei [1, 2]
Ikenaga, Takeshi [3]
Affiliations
[1] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China
[2] Southeast Univ, Shenzhen Res Inst, Shenzhen 518063, Peoples R China
[3] Waseda Univ, Grad Sch Informat Prod & Syst, Kitakyushu 8080135, Japan
Funding
National Natural Science Foundation of China;
Keywords
3D human pose estimation; human-centered automation systems; bidirectional 2D-3D transformation; image-assisted 3D offset prediction; bone-length stability; ALGORITHM; TRACKING; NETWORK;
DOI
10.1109/TASE.2023.3279928
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Automatically estimating 3D human poses in video and inferring their meanings plays an essential role in many human-centered automation systems. Existing research has made remarkable progress by first estimating 2D human joints in video and then reconstructing the 3D human pose from the 2D joints. However, reconstructing the 3D pose from 2D joints in one direction only ignores the interaction between information in 3D space and 2D space and loses rich information from the original video, which limits the ceiling of estimation accuracy. To this end, this paper proposes a bidirectional 2D-3D transformation framework that exchanges 2D and 3D information in both directions and uses video information to estimate an offset for refining the 3D human pose. In addition, a bone-length stability loss exploits human body structure to make the estimated 3D pose more natural and to further increase overall accuracy. In evaluation, the estimation error of the proposed method, measured by the mean per joint position error (MPJPE), is only 46.5 mm, which is much lower than that of state-of-the-art methods under the same experimental conditions. The improvement in accuracy will enable machines to better understand human poses for building superior human-centered automation systems.
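The abstract names two quantities without defining them: the MPJPE metric and a bone-length stability loss. The Python sketch below illustrates both in the simplest form. MPJPE is computed by its standard definition (the mean Euclidean distance between predicted and ground-truth joints). The bone-length term is only an assumption: it penalizes the variance of each bone's length across frames, which is one plausible reading of "stability", and it is not taken from the paper. The names `mpjpe`, `bone_length_variance`, the three-joint skeleton, and all coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]   # one 3D joint position, in millimetres
Pose = List[Joint]                   # one pose = one position per joint
Bone = Tuple[int, int]               # (parent joint index, child joint index)


def mpjpe(pred: Pose, truth: Pose) -> float:
    """Mean per joint position error: average Euclidean distance per joint."""
    if not pred or len(pred) != len(truth):
        raise ValueError("poses must be non-empty and have the same joint count")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)


def bone_length_variance(frames: Sequence[Pose], bones: Sequence[Bone]) -> float:
    """Assumed stability term: variance of each bone's length over the
    frames, averaged over all bones. Not the paper's formulation."""
    if not frames or not bones:
        raise ValueError("need at least one frame and one bone")
    total = 0.0
    for parent, child in bones:
        lengths = [math.dist(pose[parent], pose[child]) for pose in frames]
        mean = sum(lengths) / len(lengths)
        total += sum((length - mean) ** 2 for length in lengths) / len(lengths)
    return total / len(bones)


# Hypothetical three-joint chain (for example hip, knee, ankle).
truth = [(0.0, 0.0, 0.0), (0.0, 100.0, 0.0), (0.0, 200.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (0.0, 100.0, 0.0), (0.0, 200.0, 12.0)]
error_mm = mpjpe(pred, truth)        # per-joint errors are 5, 0 and 12 mm

# Two frames in which the first bone stretches from 100 mm to 110 mm.
frame_a = truth
frame_b = [(0.0, 0.0, 0.0), (0.0, 110.0, 0.0), (0.0, 210.0, 0.0)]
stability = bone_length_variance([frame_a, frame_b], [(0, 1), (1, 2)])

print(f"MPJPE: {error_mm:.3f} mm, bone-length variance: {stability:.3f} mm^2")
```

In this reading, a pose sequence whose bones keep a constant length scores zero on the stability term regardless of how the joints move, so the term constrains skeletal plausibility rather than position accuracy.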
Pages: 3483-3496
Page count: 14
Related papers
50 in total
  • [21] Limb Pose Aware Networks for Monocular 3D Pose Estimation
    Wu, Lele
    Yu, Zhenbo
    Liu, Yijiang
    Liu, Qingshan
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 906 - 917
  • [22] 3D human pose estimation based on 2D-3D consistency with synchronized adversarial training
    Deng, Yicheng
    Sun, Cheng
    Sun, Yongqi
    Zhu, Jiahui
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2024, 175
  • [23] Monocular Weakly-Supervised Camera-Relative 3D Human Pose Estimation
    Christidis, Anestis
    Papaioannidis, Christos
    Pitas, Ioannis
    2022 IEEE 14TH IMAGE, VIDEO, AND MULTIDIMENSIONAL SIGNAL PROCESSING WORKSHOP (IVMSP), 2022,
  • [24] Fusion of Inertial Sensor Suit and Monocular Camera for 3D Human Pelvis Pose Estimation
    Popescu, Mihaela
    Shinde, Kashmira
    Sharma, Proneet
    Gutzeit, Lisa
    Kirchner, Frank
    2024 33RD IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, ROMAN 2024, 2024, : 160 - 167
  • [25] 3D Pose Estimation and Tracking in Handball Actions Using a Monocular Camera
    Sajina, Romeo
    Ivasic-Kos, Marina
    JOURNAL OF IMAGING, 2022, 8 (11)
  • [26] 3D Human Pose Estimation = 2D Pose Estimation + Matching
    Chen, Ching-Hang
    Ramanan, Deva
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 5759 - 5767
  • [27] 3D driver pose estimation based on joint 2D-3D network
    Yao, Zhijie
    Liu, Yazhou
    Ji, Zexuan
    Sun, Quansen
    Lasang, Pongsak
    Shen, Shengmei
    IET COMPUTER VISION, 2020, 14 (03) : 84 - 91
  • [28] 3D DRIVER POSE ESTIMATION BASED ON JOINT 2D-3D NETWORK
    Yao, Zhijie
    Liu, Yazhou
    Ji, Zexuan
    Sun, Quansen
    Lasang, Pongsak
    Shen, Shengmei
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 2546 - 2550
  • [29] 3D face pose tracking from an uncalibrated monocular camera
    Zhu, ZW
    Ji, Q
    PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, 2004, : 400 - 403
  • [30] 3D Head pose estimation and camera mouse implementation using a monocular video camera
    Nabati, Masoomeh
    Behrad, Alireza
    SIGNAL IMAGE AND VIDEO PROCESSING, 2015, 9 (01) : 39 - 44