HandVoxNet++: 3D Hand Shape and Pose Estimation Using Voxel-Based Neural Networks

Cited by: 17

Authors:

Malik, Jameel [1, 2]

Shimada, Soshi [3, 4]

Elhayek, Ahmed [5]

Ali, Sk Aziz [1, 6]

Theobalt, Christian [3]

Golyanik, Vladislav [3]

Stricker, Didier [1, 6]

Affiliations:

[1] TU Kaiserslautern, D-67663 Kaiserslautern, Germany

[2] NUST, Islamabad 44000, Pakistan

[3] MPI Informat, Saarbrücken, Germany

[4] Saarland Informat Campus, D-66123 Saarbrücken, Germany

[5] UPM, Medina 42241, Saudi Arabia

[6] DFKI, D-67663 Kaiserslautern, Germany

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2022 / Vol. 44 / No. 12

Keywords:

3D hand shape and pose from a single depth map; voxelized hand shape; graph convolutions; TSDF; 3D data augmentation; shape registration; GCN-MeshReg; NRGA++

DOI:

10.1109/TPAMI.2021.3122874

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

3D hand shape and pose estimation from a single depth map is a new and challenging computer vision problem with many applications. Existing methods addressing it directly regress hand meshes via 2D convolutional neural networks, which leads to artifacts due to perspective distortions in the images. To address the limitations of the existing methods, we develop HandVoxNet++, i.e., a voxel-based deep network with 3D and graph convolutions trained in a fully supervised manner. The input to our network is a 3D voxelized depth map based on the truncated signed distance function (TSDF). HandVoxNet++ relies on two hand shape representations. The first one is the 3D voxelized grid of hand shape, which does not preserve the mesh topology but is the most accurate representation. The second representation is the hand surface, which preserves the mesh topology. We combine the advantages of both representations by aligning the hand surface to the voxelized hand shape either with a new neural Graph-Convolutions-based Mesh Registration (GCN-MeshReg) or with the classical segment-wise Non-Rigid Gravitational Approach (NRGA++), which does not rely on training data. In extensive evaluations on three public benchmarks, i.e., SynHand5M, the depth-based HANDS19 challenge, and HO-3D, the proposed HandVoxNet++ achieves state-of-the-art performance. In this journal extension of our previous approach presented at CVPR 2020, we gain 41.09% and 13.7% higher shape alignment accuracy on the SynHand5M and HANDS19 datasets, respectively. Our method was ranked first on the HANDS19 challenge dataset (Task 1: Depth-Based 3D Hand Pose Estimation) at the time our results were submitted to the portal in August 2020.


Pages: 8962-8974

Number of pages: 13
