Neural Descent for Visual 3D Human Pose and Shape

Cited by: 35
Authors
Zanfir, Andrei [1]
Bazavan, Eduard Gabriel [1]
Zanfir, Mihai [1]
Freeman, William T. [1]
Sukthankar, Rahul [1]
Sminchisescu, Cristian [1]
Affiliations
[1] Google Res, Bangalore, Karnataka, India
DOI
10.1109/CVPR46437.2021.01425
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present deep neural network methodology to reconstruct the 3d pose and shape of people, including hand gestures and facial expression, given an input RGB image. We rely on a recently introduced, expressive full body statistical 3d human model, GHUM, trained end-to-end, and learn to reconstruct its pose and shape state in a self-supervised regime. Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids both second-order differentiation when training the model parameters, and expensive state gradient descent in order to accurately minimize a semantic differentiable rendering loss at test time. Instead, we rely on novel recurrent stages to update the pose and shape parameters such that not only are losses minimized effectively, but the process is meta-regularized in order to ensure end-progress. HUND's symmetry between training and testing makes it the first 3d human sensing architecture to natively support different operating regimes including self-supervised ones. In diverse tests, we show that HUND achieves very competitive results in datasets like H3.6M and 3DPW, as well as good quality 3d reconstructions for complex imagery collected in-the-wild.
Pages: 14479-14488
Page count: 10