Multi-hop graph transformer network for 3D human pose estimation

Cited by: 3
Authors
Islam, Zaedul [1]
Ben Hamza, A. [1]
Affiliations
[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
3D human pose estimation; Graph convolutional network; Transformer; Multi-hop; Dilated convolution
DOI
10.1016/j.jvcir.2024.104174
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Accurate 3D human pose estimation is a challenging task due to occlusion and depth ambiguity. In this paper, we introduce a multi-hop graph transformer network designed for 2D-to-3D human pose estimation in videos by leveraging the strengths of multi-head self-attention and multi-hop graph convolutional networks with disentangled neighborhoods to capture spatio-temporal dependencies and handle long-range interactions. The proposed network architecture consists of a graph attention block composed of stacked layers of multi-head self-attention and graph convolution with learnable adjacency matrix, and a multi-hop graph convolutional block comprised of multi-hop convolutional and dilated convolutional layers. The combination of multi-head self-attention and multi-hop graph convolutional layers enables the model to capture both local and global dependencies, while the integration of dilated convolutional layers enhances the model's ability to handle spatial details required for accurate localization of the human body joints. Extensive experiments demonstrate the effectiveness and generalization ability of our model, achieving competitive performance on benchmark datasets.
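
As a rough illustration of what "multi-hop" neighborhoods that are "disentangled" can mean on a skeleton graph, the sketch below builds one adjacency matrix per hop distance, where the k-th matrix links only joints whose shortest-path distance is exactly k. This is a generic construction written under stated assumptions and is not taken from the paper: the abstract does not give the authors' formulation, and the function name `k_hop_adjacency` and the 5-joint chain are hypothetical.

```python
import numpy as np

def k_hop_adjacency(adj, max_hop):
    """Return [A_1, ..., A_max_hop], where A_k[i, j] = 1 iff the
    shortest-path distance between nodes i and j is exactly k.
    Assumes `adj` is a binary, symmetric matrix without self-loops."""
    n = adj.shape[0]
    # Adding self-loops makes repeated products track "within k hops".
    a_hat = ((adj + np.eye(n)) > 0).astype(int)
    within_prev = np.eye(n, dtype=int)  # nodes reachable within 0 hops
    reach = np.eye(n, dtype=int)
    hops = []
    for _ in range(max_hop):
        # Nodes reachable within one more hop than before.
        reach = ((reach @ a_hat) > 0).astype(int)
        # Keep only the pairs first reached at this hop distance.
        hops.append(reach - within_prev)
        within_prev = reach
    return hops

# Hypothetical 5-joint chain: 0 - 1 - 2 - 3 - 4
chain = np.zeros((5, 5), dtype=int)
for i in range(4):
    chain[i, i + 1] = chain[i + 1, i] = 1

hops = k_hop_adjacency(chain, max_hop=3)
```

Here `hops[0]` equals the original adjacency, while `hops[1]` connects joint 0 to joint 2 but not to joint 1, so each hop distance gets its own non-overlapping neighborhood that a graph convolution could weight separately.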
Pages: 12