Skeleton Cluster Tracking for robust multi-view multi-person 3D human pose estimation

Cited by: 3
Authors
Niu, Zehai [1]
Lu, Ke [1, 2]
Xue, Jian [1]
Wang, Jinbao [3, 4]
Affiliations
[1] Univ Chinese Acad Sci, Sch Engn Sci, 19A Yuquan Rd, Beijing 100049, Peoples R China
[2] Peng Cheng Lab, Vanke Cloud City Phase I Bldg 8,Xili St, Shenzhen 518055, Guangdong, Peoples R China
[3] Shenzhen Univ, Natl Engn Lab Big Data Syst Comp Technol, Shenzhen 518060, Peoples R China
[4] Guangdong Prov Key Lab Intelligent Informat Proc, Shenzhen 518060, Peoples R China
Keywords
3D human pose estimation; Motion capture; Deep learning
DOI
10.1016/j.cviu.2024.104059
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The multi-view 3D human pose estimation task relies on 2D human pose estimation for each view; however, severe occlusion, truncation, and human interaction lead to incorrect 2D human pose estimation for some views. The traditional "Matching-Lifting-Tracking" paradigm amplifies the incorrect 2D human pose into an incorrect 3D human pose, which significantly challenges the robustness of multi-view 3D human pose estimation. In this paper, we propose a novel method that tackles the inherent difficulties of the traditional paradigm. This method is rooted in the newly devised "Skeleton Pooling-Clustering-Tracking (SPCT)" paradigm. It initiates a 2D human pose estimation for each perspective. Then a symmetrical dilated network is created for skeleton pool estimation. Upon clustering the skeleton pool, we introduce and implement an innovative tracking method that is explicitly designed for the SPCT paradigm. The tracking method refines and filters the skeleton clusters, thereby enhancing the robustness of the multi-person 3D human pose estimation results. By coupling the skeleton pool with the tracking refinement process, our method obtains high-quality multi-person 3D human pose estimation results despite severe occlusions that produce erroneous 2D and 3D estimates. By employing the proposed SPCT paradigm and a computationally efficient network architecture, our method outperformed existing approaches regarding robustness on the Shelf, 4D Association, and CMU Panoptic datasets, and could be applied in practical scenarios such as markerless motion capture and animation production.
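The abstract gives no algorithmic detail for the clustering step, so the following is only a generic illustration of the idea of grouping a pool of 3D skeleton candidates by person and fusing each group robustly. It is not the authors' SPCT implementation: the greedy distance-based grouping, the per-coordinate median fusion, the 0.3 m threshold, the 15-joint skeleton, and all function names are assumptions made for this sketch.

```python
import numpy as np

def cluster_skeleton_pool(pool, threshold=0.3):
    """Greedily group 3D skeleton candidates (hypothetical helper).

    A candidate joins the first cluster whose seed skeleton lies within
    `threshold` (assumed metres) in mean per-joint Euclidean distance;
    otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of indices into `pool`
    for i, skel in enumerate(pool):
        for members in clusters:
            seed = pool[members[0]]
            dist = np.linalg.norm(skel - seed, axis=-1).mean()
            if dist < threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def fuse_cluster(pool, members):
    """Fuse one cluster with a per-coordinate median (assumed fusion rule),
    which tolerates a minority of erroneous candidates."""
    return np.median(pool[members], axis=0)

# Synthetic pool: two people 2 m apart, three noisy candidates each,
# interleaved as A, B, A, B, A, B.
rng = np.random.default_rng(0)
person_a = rng.uniform(-0.5, 0.5, size=(15, 3))  # 15 joints, assumed layout
person_b = person_a + np.array([2.0, 0.0, 0.0])
pool = np.stack([
    (person_a if k % 2 == 0 else person_b) + rng.normal(0.0, 0.01, (15, 3))
    for k in range(6)
])
pool[2, 0, 0] += 1.0  # simulate one badly estimated joint in one candidate

clusters = cluster_skeleton_pool(pool)
fused_a = fuse_cluster(pool, clusters[0])
```

In this toy setup the corrupted joint in one candidate does not move the fused skeleton for the first person, because the median ignores a single outlier among three candidates. A real system would replace the synthetic pool with network-estimated skeletons and add the temporal tracking that the abstract describes.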
Pages: 13