GTIGNet: Global Topology Interaction Graphormer Network for 3D hand pose estimation

Cited by: 0
Authors
Liu, Yanjun [1]
Fan, Wanshu [1]
Wang, Cong [2]
Wen, Shixi [3]
Yang, Xin [4]
Zhang, Qiang [1, 4]
Wei, Xiaopeng [4]
Zhou, Dongsheng [1, 4]
Affiliations
[1] Dalian Univ, Sch Software Engn, Natl & Local Joint Engn Lab Comp Aided Design, Dalian, Peoples R China
[2] Ctr Adv Reliabil & Safety CAiRS, Hong Kong, Peoples R China
[3] Dalian Univ, Sch Informat Engn, Dalian, Peoples R China
[4] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
3D hand pose estimation; Transformer; GCN; Topology; 3D computer vision; Sign-language recognition
DOI
10.1016/j.neunet.2025.107221
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Estimating 3D hand poses from monocular RGB images presents a series of challenges, including complex hand structures, self-occlusions, and depth ambiguities. Existing methods often fail to capture the long-distance dependencies among hand joints across skeletal and non-skeletal connections. To address these limitations, we introduce the Global Topology Interaction Graphormer Network (GTIGNet), a novel deep learning architecture designed to improve 3D hand pose estimation. Our model incorporates a Context-Aware Attention Block (CAAB) within the 2D pose estimator to enhance the extraction of multi-scale features, yielding more accurate 2D joint heatmaps to support the subsequent task. Additionally, we introduce a High-Order Graphormer that explicitly and implicitly models the topological structure of hand joints, thereby enhancing feature interaction. Ablation studies confirm the effectiveness of our approach, and experimental results on four challenging datasets, Rendered Hand Dataset (RHD), Stereo Hand Pose Benchmark (STB), First-Person Hand Action Benchmark (FPHA), and FreiHAND Dataset, indicate that GTIGNet achieves state-of-the-art performance in 3D hand pose estimation. Notably, our model achieves a Mean Per Joint Position Error (MPJPE) of 9.98 mm on RHD, 6.12 mm on STB, 11.15 mm on FPHA, and 10.97 mm on FreiHAND.
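For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how MPJPE is commonly computed: the mean Euclidean distance between each predicted joint and its ground-truth counterpart. The function name `mpjpe` and the toy coordinates are hypothetical, and the record does not say whether the authors apply root alignment or scale normalization before measuring, so this should not be read as their exact evaluation protocol.

```python
import math
from typing import Sequence

Joint = Sequence[float]  # (x, y, z) coordinates, assumed to be in millimetres


def mpjpe(pred: Sequence[Joint], gt: Sequence[Joint]) -> float:
    """Mean Euclidean distance between matching predicted and ground-truth joints."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


# Hypothetical toy data with three joints; not taken from any dataset.
pred = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 3.0, 4.0)]
gt = [(0.0, 0.0, 0.0), (10.0, 0.0, 6.0), (0.0, 0.0, 0.0)]

# Per-joint errors are 0, 6 and 5 mm, so the mean is 11/3 mm.
error_mm = mpjpe(pred, gt)
```

A real hand model would typically use 21 joints per hand, and dataset-level figures such as those above are usually averaged over all test samples.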
Pages: 14
Related papers
50 in total
  • [1] HandGCNFormer: A Novel Topology-Aware Transformer Network for 3D Hand Pose Estimation
    Wang, Yintong
    Chen, LiLi
    Li, Jiamao
    Zhang, Xiaolin
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 5664 - 5673
  • [2] PEAN: 3D Hand Pose Estimation Adversarial Network
    Sun, Linhui
    Zhang, Yifan
    Cheng, Jian
    Lu, Hanqing
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 1251 - 1258
  • [3] CASCADED POINT NETWORK FOR 3D HAND POSE ESTIMATION
    Dou, Yikun
    Wang, Xuguang
    Zhu, Yuying
    Deng, Xiaoming
    Ma, Cuixia
    Chang, Liang
    Wang, Hongan
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 1982 - 1986
  • [4] Coot optimization based Enhanced Global Pyramid Network for 3D hand pose estimation
    Malavath, Pallavi
    Devarakonda, Nagaraju
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2022, 3 (04):
  • [5] GHand: A Graph Convolution Network for 3D Hand Pose Estimation
    Wang, Pengsheng
    Xue, Guangtao
    Li, Pin
    Kim, Jinman
    Sheng, Bin
    Mao, Lijuan
    ADVANCES IN COMPUTER GRAPHICS, CGI 2020, 2020, 12221 : 374 - 381
  • [6] Accurate 3D hand pose estimation network utilizing joints information
    Zhang, Xiongquan
    Huang, Shiliang
    Ye, Zhongfu
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2021, 90
  • [7] SARN: Shifted Attention Regression Network for 3D Hand Pose Estimation
    Zhu, Chenfei
    Hu, Boce
    Chen, Jiawei
    Ai, Xupeng
    Agrawal, Sunil K.
    BIOENGINEERING-BASEL, 2023, 10 (02):
  • [8] HMTNet: 3D Hand Pose Estimation From Single Depth Image Based on Hand Morphological Topology
    Zhou, Weiguo
    Jiang, Xin
    Chen, Chen
    Mei, Sijia
    Liu, Yun-Hui
    IEEE SENSORS JOURNAL, 2020, 20 (11) : 6004 - 6011
  • [9] Dense 3D Regression for Hand Pose Estimation
    Wan, Chengde
    Probst, Thomas
    Van Gool, Luc
    Yao, Angela
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 5147 - 5156
  • [10] Temporal Hints in 3D Hand Pose Estimation
    Yu, Taidong
    Cao, Zhiguo
    Xiao, Yang
    Zhang, Boshen
    Zhu, Zihao
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 2042 - 2047