HandVoxNet++: 3D Hand Shape and Pose Estimation Using Voxel-Based Neural Networks

Cited by: 16

Authors:

Malik, Jameel [1,2]

Shimada, Soshi [3,4]

Elhayek, Ahmed [5]

Ali, Sk Aziz [1,6]

Theobalt, Christian [3]

Golyanik, Vladislav [3]

Stricker, Didier [1,6]

Affiliations:

[1] TU Kaiserslautern, D-67663 Kaiserslautern, Germany

[2] NUST, Islamabad 44000, Pakistan

[3] MPI Informat, Saarbrücken, Germany

[4] Saarland Informat Campus, D-66123 Saarbrücken, Germany

[5] UPM, Medina 42241, Saudi Arabia

[6] DFKI, D-67663 Kaiserslautern, Germany

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2022 / Vol. 44 / No. 12

Keywords:

3D hand shape and pose from a single depth map; voxelized hand shape; graph convolutions; TSDF; 3D data augmentation; shape registration; GCN-MeshReg; NRGA++

DOI:

10.1109/TPAMI.2021.3122874

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

3D hand shape and pose estimation from a single depth map is a new and challenging computer vision problem with many applications. Existing methods directly regress hand meshes via 2D convolutional neural networks, which leads to artifacts due to perspective distortions in the images. To address these limitations, we develop HandVoxNet++, a voxel-based deep network with 3D and graph convolutions trained in a fully supervised manner. The input to our network is a 3D voxelized depth map based on the truncated signed distance function (TSDF). HandVoxNet++ relies on two hand shape representations. The first is the 3D voxelized grid of the hand shape, which is the most accurate representation but does not preserve the mesh topology. The second is the hand surface, which preserves the mesh topology. We combine the advantages of both representations by aligning the hand surface to the voxelized hand shape with either a new neural Graph-Convolutions-based Mesh Registration (GCN-MeshReg) or the classical segment-wise Non-Rigid Gravitational Approach (NRGA++), which does not rely on training data. In extensive evaluations on three public benchmarks, i.e., SynHand5M, the depth-based HANDS19 challenge and HO-3D, the proposed HandVoxNet++ achieves state-of-the-art performance. In this journal extension of our previous approach presented at CVPR 2020, we gain 41.09% and 13.7% higher shape alignment accuracy on the SynHand5M and HANDS19 datasets, respectively. Our method was ranked first on the HANDS19 challenge dataset (Task 1: Depth-Based 3D Hand Pose Estimation) at the time our results were submitted to the portal in August 2020.


Pages: 8962-8974

Number of pages: 13
