Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments

Cited by: 2060

Authors:

Ionescu, Catalin ^{[1,2]}

Papava, Dragos ^{[1]}

Olaru, Vlad ^{[1]}

Sminchisescu, Cristian ^{[3,4]}

Affiliations:

[1] Romanian Acad IMAR, Inst Math, RO-010702 Bucharest, Romania

[2] Univ Bonn, Fac Math & Nat Sci, D-53115 Bonn, Germany

[3] Lund Univ, Fac Engn, Dept Math, SE-22100 Lund, Sweden

[4] Inst Math Romanian Acad, Bucharest, Romania

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2014 / Vol. 36 / No. 07

Keywords:

3D human pose estimation; human motion capture data; articulated body modeling; optimization; large-scale learning; structured prediction; Fourier kernel approximations; HUMAN POSE; CAPTURE;

DOI:

10.1109/TPAMI.2013.248

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We introduce a new dataset, Human3.6M, of 3.6 Million accurate 3D Human poses, acquired by recording the performance of 5 female and 6 male subjects, under 4 different viewpoints, for training realistic human sensing systems and for evaluating the next generation of human pose estimation models and algorithms. Besides increasing the size of the datasets in the current state-of-the-art by several orders of magnitude, we also aim to complement such datasets with a diverse set of motions and poses encountered as part of typical human activities (taking photos, talking on the phone, posing, greeting, eating, etc.), with additional synchronized image, human motion capture, and time of flight (depth) data, and with accurate 3D body scans of all the subject actors involved. We also provide controlled mixed reality evaluation scenarios where 3D human models are animated using motion capture and inserted using correct 3D geometry, in complex real environments, viewed with moving cameras, and under occlusion. Finally, we provide a set of large-scale statistical models and detailed evaluation baselines for the dataset illustrating its diversity and the scope for improvement by future work in the research community. Our experiments show that our best large-scale model can leverage our full training set to obtain a 20% improvement in performance compared to a training set of the scale of the largest existing public dataset for this problem. Yet the potential for improvement by leveraging higher capacity, more complex models with our large dataset, is substantially vaster and should stimulate future research. The dataset together with code for the associated large-scale learning models, features, visualization tools, as well as the evaluation server, is available online at http://vision.imar.ro/human3.6m.

Pages: 1325 - 1339

Number of pages: 15

50 records in total

[31] Human-Information Interaction in 3D Immersive Virtual Environments
Komlodi, A.
Hercegfi, K.
Jozsa, E.
Koles, M.
3RD IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFOCOMMUNICATIONS (COGINFOCOM 2012), 2012, : 597 - 600
[32] Large-scale reconstruction of 3D structures of human chromosomes from chromosomal contact data
Tuan Trieu
Cheng, Jianlin
NUCLEIC ACIDS RESEARCH, 2014, 42 (07) : e52
[33] Incomplete Region Estimation and Restoration of 3D Point Cloud Human Face Datasets
Uddin, Kutub
Jeong, Tae Hyun
Oh, Byung Tae
SENSORS, 2022, 22 (03)
[34] Methods for capturing 3D shape and advantages in the application of 3D human modeling in ergonomics studies
Batista, Denise
Pereira, Fernando
OCCUPATIONAL SAFETY AND HYGIENE - SHO2013, 2013, : 67 - 69
[35] Deep Multitask Architecture for Integrated 2D and 3D Human Sensing
Popa, Alin-Ionut
Zanfir, Mihai
Sminchisescu, Cristian
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 4714 - 4723
[36] Graph-based Topological Exploration Planning in Large-scale 3D Environments
Yang, Fan
Lee, Dung-Han
Keller, John
Scherer, Sebastian
2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 12730 - 12736
[37] VoxelScape: Large Scale Simulated 3D Point Cloud Dataset of Urban Traffic Environments
Saleh, Khaled
Hossny, Mohammed
Abobakr, Ahmed
Attia, Mohammed
Iskander, Julie
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (09) : 9435 - 9448
[38] Generating Diverse and Natural 3D Human Motions from Text
Guo, Chuan
Zou, Shihao
Zuo, Xinxin
Wang, Sen
Ji, Wei
Li, Xingyu
Cheng, Li
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 5142 - 5151
[39] Natural oscillatory modes of 3D deformation of the human brain in vivo
Escarcega, J. D.
Knutsen, A. K.
Okamoto, R. J.
Pham, D. L.
Bayly, P. V.
JOURNAL OF BIOMECHANICS, 2021, 119
[40] RETRIEVAL-BASED NATURAL 3D HUMAN MOTION GENERATION
Li, Yuqi
Luo, Yizhi
Wu, Song
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023