Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments

Cited by: 2060
Authors
Ionescu, Catalin [1, 2]
Papava, Dragos [1]
Olaru, Vlad [1]
Sminchisescu, Cristian [3, 4]
Affiliations
[1] Romanian Acad IMAR, Inst Math, RO-010702 Bucharest, Romania
[2] Univ Bonn, Fac Math & Nat Sci, D-53115 Bonn, Germany
[3] Lund Univ, Fac Engn, Dept Math, SE-22100 Lund, Sweden
[4] Inst Math Romanian Acad, Bucharest, Romania
Keywords
3D human pose estimation; human motion capture data; articulated body modeling; optimization; large-scale learning; structured prediction; Fourier kernel approximations; HUMAN POSE; CAPTURE;
DOI
10.1109/TPAMI.2013.248
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce a new dataset, Human3.6M, of 3.6 Million accurate 3D Human poses, acquired by recording the performance of 5 female and 6 male subjects, under 4 different viewpoints, for training realistic human sensing systems and for evaluating the next generation of human pose estimation models and algorithms. Besides increasing the size of the datasets in the current state-of-the-art by several orders of magnitude, we also aim to complement such datasets with a diverse set of motions and poses encountered as part of typical human activities (taking photos, talking on the phone, posing, greeting, eating, etc.), with additional synchronized image, human motion capture, and time of flight (depth) data, and with accurate 3D body scans of all the subject actors involved. We also provide controlled mixed reality evaluation scenarios where 3D human models are animated using motion capture and inserted using correct 3D geometry, in complex real environments, viewed with moving cameras, and under occlusion. Finally, we provide a set of large-scale statistical models and detailed evaluation baselines for the dataset illustrating its diversity and the scope for improvement by future work in the research community. Our experiments show that our best large-scale model can leverage our full training set to obtain a 20% improvement in performance compared to a training set of the scale of the largest existing public dataset for this problem. Yet the potential for improvement by leveraging higher capacity, more complex models with our large dataset, is substantially vaster and should stimulate future research. The dataset together with code for the associated large-scale learning models, features, visualization tools, as well as the evaluation server, is available online at http://vision.imar.ro/human3.6m.
Pages: 1325-1339
Page count: 15