Practical 3D human skeleton tracking based on multi-view and multi-Kinect fusion

Cited by: 0

Authors:

Manh-Hung Nguyen

Ching-Chun Hsiao

Wen-Huang Cheng

Ching-Chun Huang

Affiliations:

[1] HCMC University of Technology and Education, Faculty of Electrical Electronic Engineering

[2] National Yang Ming Chiao Tung University, Department of Computer Science

[3] National Yang Ming Chiao Tung University, Institute of Electronics

Source:

Multimedia Systems | 2022 / Vol. 28

Keywords:

Multi-Kinect skeleton tracking; OpenPose; Sensor fusion; Left–right confusion; Self-occlusion; Lost tracking

DOI:

Not available

Abstract:

In this paper, we propose a multi-view system for 3D human skeleton tracking based on multi-cue fusion. Multiple Kinect version 2 cameras are used to build a low-cost system. Although Kinect cameras can detect 3D skeletons with their depth sensors, skeleton extraction still faces challenges such as left–right confusion and severe self-occlusion. Moreover, human skeleton tracking systems often have difficulty recovering from lost tracking. These challenges make robust 3D skeleton tracking nontrivial. To address them in a unified framework, we first correct the skeleton's left–right ambiguity by referring to the human joints extracted by OpenPose. Unlike Kinect, OpenPose extracts target joints through learning-based image analysis, which lets it differentiate a person's front side from their back side. With help from the 2D images, we can therefore correct the left–right skeleton confusion. In addition, we find that self-occlusion severely degrades Kinect joint detection because the joint depth is estimated incorrectly. To alleviate this problem, we reconstruct a reference 3D skeleton by back-projecting the corresponding 2D OpenPose joints from multiple cameras. The reconstructed joints are less sensitive to occlusion and can serve as 3D anchors for skeleton fusion. Finally, we introduce inter-joint constraints into our probabilistic skeleton tracking framework to track all joints simultaneously. Unlike conventional methods that treat each joint individually, our framework uses neighboring joints to position each other. In this way, when joints are missing due to occlusion, the inter-joint constraints keep the skeleton consistent and preserve the length between neighboring joints. We evaluate our method on five challenging actions using a real-time demo system. The results show that the system tracks skeletons stably, without error propagation or vibration. The experimental results also show that the average localization error is smaller than that of conventional methods.
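
The abstract does not say how the reference 3D skeleton is computed from the 2D OpenPose joints. As a rough illustration of the general idea only, the sketch below recovers one 3D joint from its 2D detections in several calibrated views using standard linear (DLT) triangulation. This is a stand-in technique, not necessarily the authors' method. The camera matrices, the joint position, and the helper names `triangulate_joint` and `project` are all hypothetical values chosen for the example.

```python
import numpy as np

def triangulate_joint(projections, points_2d):
    """Linear (DLT) triangulation of one joint from two or more calibrated views.

    projections: list of 3x4 camera projection matrices (assumed known from calibration).
    points_2d:   list of (u, v) pixel detections of the same joint, one per camera.
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        # Each view contributes two linear constraints on the homogeneous 3D point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point into a camera with projection matrix P (pinhole model)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical setup: two cameras with identical intrinsics, 1 m apart along x.
K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

# Hypothetical joint position (metres, in the first camera's frame).
joint = np.array([0.2, -0.1, 3.0])

# Simulate noise-free 2D detections, then recover the 3D position from them.
detections = [project(P1, joint), project(P2, joint)]
estimate = triangulate_joint([P1, P2], detections)
```

With noise-free detections the estimate matches the original joint position up to numerical precision. With real detections, adding more views over-determines the system and the SVD returns a least-squares compromise, which is one reason multi-view reconstruction tends to be less sensitive to a single occluded view.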

Pages: 529-552

Page count: 23
