Monocular Expressive 3D Human Reconstruction of Multiple People

Authors
Zhao, Zhenghao [1]
Tang, Hao [2]
Wan, Joy [3]
Yan, Yan [1]
Affiliations
[1] Illinois Inst Technol, Chicago, IL 60616 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA USA
[3] Univ Illinois, Urbana, IL USA
Keywords
3D Pose Estimation; Whole-body Pose Estimation; Multi-person Pose Estimation
DOI
10.1145/3652583.3658092
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Whole-body pose estimation aims to regress human pose models that include body, hand, and facial details from RGB images. While whole-body mesh recovery has been studied extensively in recent literature, the focus has predominantly been on a single person, even though multiple people frequently appear in practical scenarios. As in the body-only case, single-person whole-body pose estimation methods often fail in multi-person scenes for two reasons: (i) a bounding box can be ambiguous and contain more than one instance, which makes it difficult for single-person-oriented methods to regress the body mesh model of the target person; (ii) single-person approaches neglect person-person occlusions and the depth order among instances, and thus generate interpenetrating models. In this paper, we propose the Multi-person Expressive POse (MEPO) model, which reconstructs expressive 3D human models of multiple people. To the best of our knowledge, it is the first multi-person whole-body mesh reconstruction model, and it is strengthened by a heatmap, a depth map, and a depth order loss. We propose the Heatmap Enhancement Net (HENet), which uses heatmap information to help the model concentrate on the target person in crowded multi-person cases, while the depth map supplies depth information about the image. Furthermore, we impose a depth order loss to recover human meshes precisely for overlapping people. In our experiments, we evaluate our model on multiple challenging datasets, including AGORA, which contains complex occlusions similar to real-world scenarios. Our method significantly outperforms state-of-the-art pose estimation methods.
Pages: 423-432
Page count: 10