HandVoxNet++: 3D Hand Shape and Pose Estimation Using Voxel-Based Neural Networks

Cited by: 16
Authors
Malik, Jameel [1, 2]
Shimada, Soshi [3, 4]
Elhayek, Ahmed [5]
Ali, Sk Aziz [1, 6]
Theobalt, Christian [3]
Golyanik, Vladislav [3]
Stricker, Didier [1, 6]
Affiliations
[1] TU Kaiserslautern, D-67663 Kaiserslautern, Germany
[2] NUST, Islamabad 44000, Pakistan
[3] MPI Informat, Saarbrücken, Germany
[4] Saarland Informat Campus, D-66123 Saarbrücken, Germany
[5] UPM, Medina 42241, Saudi Arabia
[6] DFKI, D-67663 Kaiserslautern, Germany
Keywords
3D hand shape and pose from a single depth map; voxelized hand shape; graph convolutions; TSDF; 3D data augmentation; shape registration; GCN-MeshReg; NRGA++
DOI
10.1109/TPAMI.2021.3122874
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
3D hand shape and pose estimation from a single depth map is a new and challenging computer vision problem with many applications. Existing methods addressing it directly regress hand meshes via 2D convolutional neural networks, which leads to artifacts due to perspective distortions in the images. To address the limitations of the existing methods, we develop HandVoxNet++, i.e., a voxel-based deep network with 3D and graph convolutions trained in a fully supervised manner. The input to our network is a 3D voxelized depth map based on the truncated signed distance function (TSDF). HandVoxNet++ relies on two hand shape representations. The first one is the 3D voxelized grid of hand shape, which does not preserve the mesh topology but is the most accurate representation. The second representation is the hand surface, which preserves the mesh topology. We combine the advantages of both representations by aligning the hand surface to the voxelized hand shape either with a new neural Graph-Convolutions-based Mesh Registration (GCN-MeshReg) or with the classical segment-wise Non-Rigid Gravitational Approach (NRGA++), which does not rely on training data. In extensive evaluations on three public benchmarks, i.e., SynHand5M, the depth-based HANDS19 challenge and HO-3D, the proposed HandVoxNet++ achieves state-of-the-art performance. In this journal extension of our previous approach presented at CVPR 2020, we gain 41.09% and 13.7% higher shape alignment accuracy on the SynHand5M and HANDS19 datasets, respectively. Our method ranked first on the HANDS19 challenge dataset (Task 1: Depth-Based 3D Hand Pose Estimation) at the time our results were submitted to the portal in August 2020.
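To make the TSDF input mentioned in the abstract concrete, the following is a minimal illustrative sketch of one common way to voxelize a depth map into a truncated signed distance volume. It is not taken from the paper: the function name `depth_to_tsdf`, the pinhole intrinsics `fx, fy, cx, cy`, the 200 mm cube, the 32^3 resolution and the three-voxel truncation distance are all assumptions chosen for illustration, and the authors' actual voxelization may differ.

```python
import numpy as np

def depth_to_tsdf(depth, fx, fy, cx, cy, center, cube_size=200.0, res=32, mu=None):
    """Projective TSDF of a depth map (in mm) on a res^3 grid centred at `center`.

    Hypothetical helper for illustration only; all defaults are assumptions.
    """
    step = cube_size / res
    if mu is None:
        mu = 3.0 * step  # truncation distance: three voxel widths (assumed)
    h, w = depth.shape

    # Voxel centres along one axis, relative to the cube centre.
    axis = -cube_size / 2.0 + step * (np.arange(res) + 0.5)
    gx, gy, gz = np.meshgrid(axis + center[0], axis + center[1],
                             axis + center[2], indexing="ij")

    # Project every voxel centre into the image with a pinhole camera model.
    safe_z = np.where(gz > 0, gz, 1.0)  # avoid division by zero behind the camera
    u = np.round(fx * gx / safe_z + cx).astype(int)
    v = np.round(fy * gy / safe_z + cy).astype(int)
    inside = (gz > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Look up the observed depth for voxels that land inside the image.
    observed = np.zeros_like(gz)
    observed[inside] = depth[v[inside], u[inside]]
    valid = inside & (observed > 0)  # zero depth is treated as "no measurement"

    # Signed distance along the viewing direction, truncated to [-1, 1].
    tsdf = np.ones((res, res, res), dtype=np.float32)  # default: free space
    tsdf[valid] = np.clip((observed[valid] - gz[valid]) / mu, -1.0, 1.0)
    return tsdf

# Usage: a synthetic flat surface 400 mm in front of the camera.
depth = np.full((64, 64), 400.0, dtype=np.float32)
tsdf = depth_to_tsdf(depth, fx=50.0, fy=50.0, cx=32.0, cy=32.0,
                     center=(0.0, 0.0, 400.0))
print(tsdf.shape)
```

Voxels in front of the observed surface receive positive values, voxels behind it negative values, and the zero crossing marks the surface, which is what makes this kind of volume a convenient input for 3D convolutions.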
Pages: 8962-8974
Number of pages: 13
Related papers
50 in total
  • [41] Voxel-based 3D face representations for recognition
    Moreno, AB
    Sánchez, A
    Vélez, JF
    IWSSIP 2005: PROCEEDINGS OF THE 12TH INTERNATIONAL WORKSHOP ON SYSTEMS, SIGNALS & IMAGE PROCESSING, 2005, : 283 - 287
  • [42] 3D Convolutional Neural Networks for Efficient and Robust Hand Pose Estimation from Single Depth Images
    Ge, Liuhao
    Liang, Hui
    Yuan, Junsong
    Thalmann, Daniel
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 5679 - 5688
  • [43] Voxel-Based Assessment of Printability of 3D Shapes
    Telea, Alexandru
    Jalba, Andrei
    MATHEMATICAL MORPHOLOGY AND ITS APPLICATIONS TO IMAGE AND SIGNAL PROCESSING, (ISMM 2011), 2011, 6671 : 393 - 404
  • [44] Dense 3D Regression for Hand Pose Estimation
    Wan, Chengde
    Probst, Thomas
    Van Gool, Luc
    Yao, Angela
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 5147 - 5156
  • [45] Temporal Hints in 3D Hand Pose Estimation
    Yu, Taidong
    Cao, Zhiguo
    Xiao, Yang
    Zhang, Boshen
    Zhu, Zihao
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 2042 - 2047
  • [46] Multi-person 3D pose estimation from 3D cloud data using 3D convolutional neural networks
    Vasileiadis, Manolis
    Bouganis, Christos-Savvas
    Tzovaras, Dimitrios
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2019, 185 : 12 - 23
  • [47] PCHPS: The Estimation of 3D Hand Pose and Shape using Point Cloud from a Single Depth Image
    Huang, Haozhe
    Zhuang, Zilong
    Hu, Qing
    Huang, Zizhao
    Qin, Wei
    2020 IEEE 16TH INTERNATIONAL CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2020, : 1231 - 1236
  • [48] Neural Voting Field for Camera-Space 3D Hand Pose Estimation
    Huang, Lin
    Lin, Chung-Ching
    Lin, Kevin
    Liang, Lin
    Wang, Lijuan
    Yuan, Junsong
    Liu, Zicheng
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 8969 - 8978
  • [49] kVp, mA, and voxel size effect on 3D voxel-based superimposition
    Eliliwi, Manhal
    Bazina, Mohamed
    Palomo, Juan Martin
    ANGLE ORTHODONTIST, 2020, 90 (02) : 269 - 277
  • [50] Camera pose estimation using voxel-based features for autonomous vehicle localization tracking
    Lee, Sangyun
    Moon, Yeon-Kug
    2022 37TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC 2022), 2022, : 185 - 188