Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments

Cited by: 2060
Authors
Ionescu, Catalin [1 ,2 ]
Papava, Dragos [1 ]
Olaru, Vlad [1 ]
Sminchisescu, Cristian [3 ,4 ]
Affiliations
[1] Romanian Acad IMAR, Inst Math, RO-010702 Bucharest, Romania
[2] Univ Bonn, Fac Math & Nat Sci, D-53115 Bonn, Germany
[3] Lund Univ, Fac Engn, Dept Math, SE-22100 Lund, Sweden
[4] Inst Math Romanian Acad, Bucharest, Romania
Keywords
3D human pose estimation; human motion capture data; articulated body modeling; optimization; large-scale learning; structured prediction; Fourier kernel approximations; HUMAN POSE; CAPTURE;
DOI
10.1109/TPAMI.2013.248
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce a new dataset, Human3.6M, of 3.6 Million accurate 3D Human poses, acquired by recording the performance of 5 female and 6 male subjects, under 4 different viewpoints, for training realistic human sensing systems and for evaluating the next generation of human pose estimation models and algorithms. Besides increasing the size of the datasets in the current state-of-the-art by several orders of magnitude, we also aim to complement such datasets with a diverse set of motions and poses encountered as part of typical human activities (taking photos, talking on the phone, posing, greeting, eating, etc.), with additional synchronized image, human motion capture, and time of flight (depth) data, and with accurate 3D body scans of all the subject actors involved. We also provide controlled mixed reality evaluation scenarios where 3D human models are animated using motion capture and inserted using correct 3D geometry, in complex real environments, viewed with moving cameras, and under occlusion. Finally, we provide a set of large-scale statistical models and detailed evaluation baselines for the dataset illustrating its diversity and the scope for improvement by future work in the research community. Our experiments show that our best large-scale model can leverage our full training set to obtain a 20% improvement in performance compared to a training set of the scale of the largest existing public dataset for this problem. Yet the potential for improvement by leveraging higher capacity, more complex models with our large dataset, is substantially vaster and should stimulate future research. The dataset together with code for the associated large-scale learning models, features, visualization tools, as well as the evaluation server, is available online at http://vision.imar.ro/human3.6m.
Pages: 1325-1339
Page count: 15
Related papers
50 in total
  • [12] 3D bioprinting of human-scale tissues
    Graham, Dustin M.
    LAB ANIMAL, 2016, 45 (05) : 155 - 155
  • [13] Large-scale 3D imaging of insects with natural color
    Qian, Jia
    Dang, Shipei
    Wang, Zhaojun
    Zhou, Xing
    Dan, Dan
    Yao, Baoli
    Tong, Yijie
    Yang, Haidong
    Lu, Yuanyuan
    Chen, Yandong
    Yang, Xingke
    Bai, Ming
    Lei, Ming
    OPTICS EXPRESS, 2019, 27 (04): : 4845 - 4857
  • [14] Modeling 3D Environments through Hidden Human Context
    Jiang, Yun
    Koppula, Hema S.
    Saxena, Ashutosh
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (10) : 2040 - 2053
  • [15] 3D visual reconstruction of large scale natural sites and their fauna
    Komodakis, N
    Panagiotakis, C
    Tziritas, G
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2005, 20 (9-10) : 869 - 890
  • [16] Autonomous 3D Exploration in Large-Scale Environments with Dynamic Obstacles
    Wiman, Emil
    Widen, Ludvig
    Tiger, Mattias
    Heintz, Fredrik
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024, : 2389 - 2395
  • [17] 3D GIS for geo-coding human activity in micro-scale urban environments
    Lee, J
    GEOGRAPHIC INFORMATION SCIENCE, PROCEEDINGS, 2004, 3234 : 162 - 178
  • [18] Novel Large-Scale 3D Electrical Impedance Tomography Modeling of the Human Head
    Horesh, L.
    Bollhofer, M.
    Schweiger, M.
    Arridge, S. R.
    Holder, D. S.
    WORLD CONGRESS ON MEDICAL PHYSICS AND BIOMEDICAL ENGINEERING 2006, VOL 14, PTS 1-6, 2007, 14 : 3858 - +
  • [19] 3D Human Motion Sensing from Multiple Cameras
    Nordin, Nadira
    Soori, Umair
    Arshad, Mohd Rizal
    ICIAS 2007: INTERNATIONAL CONFERENCE ON INTELLIGENT & ADVANCED SYSTEMS, VOLS 1-3, PROCEEDINGS, 2007, : 325 - 329
  • [20] NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding
    Liu, Jun
    Shahroudy, Amir
    Perez, Mauricio
    Wang, Gang
    Duan, Ling-Yu
    Kot, Alex C.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (10) : 2684 - 2701