Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments

Times cited: 2060
Authors
Ionescu, Catalin [1, 2]
Papava, Dragos [1]
Olaru, Vlad [1]
Sminchisescu, Cristian [3, 4]
Affiliations
[1] Romanian Acad IMAR, Inst Math, RO-010702 Bucharest, Romania
[2] Univ Bonn, Fac Math & Nat Sci, D-53115 Bonn, Germany
[3] Lund Univ, Fac Engn, Dept Math, SE-22100 Lund, Sweden
[4] Inst Math Romanian Acad, Bucharest, Romania
Keywords
3D human pose estimation; human motion capture data; articulated body modeling; optimization; large-scale learning; structured prediction; Fourier kernel approximations; HUMAN POSE; CAPTURE;
DOI
10.1109/TPAMI.2013.248
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a new dataset, Human3.6M, of 3.6 Million accurate 3D Human poses, acquired by recording the performance of 5 female and 6 male subjects, under 4 different viewpoints, for training realistic human sensing systems and for evaluating the next generation of human pose estimation models and algorithms. Besides increasing the size of the datasets in the current state-of-the-art by several orders of magnitude, we also aim to complement such datasets with a diverse set of motions and poses encountered as part of typical human activities (taking photos, talking on the phone, posing, greeting, eating, etc.), with additional synchronized image, human motion capture, and time of flight (depth) data, and with accurate 3D body scans of all the subject actors involved. We also provide controlled mixed reality evaluation scenarios where 3D human models are animated using motion capture and inserted using correct 3D geometry, in complex real environments, viewed with moving cameras, and under occlusion. Finally, we provide a set of large-scale statistical models and detailed evaluation baselines for the dataset illustrating its diversity and the scope for improvement by future work in the research community. Our experiments show that our best large-scale model can leverage our full training set to obtain a 20% improvement in performance compared to a training set of the scale of the largest existing public dataset for this problem. Yet the potential for improvement by leveraging higher capacity, more complex models with our large dataset, is substantially vaster and should stimulate future research. The dataset together with code for the associated large-scale learning models, features, visualization tools, as well as the evaluation server, is available online at http://vision.imar.ro/human3.6m.
Pages: 1325-1339
Number of pages: 15