PoseGTAC: Graph Transformer Encoder-Decoder with Atrous Convolution for 3D Human Pose Estimation

Cited by: 0
Authors
Zhu, Yiran [1]
Xu, Xing [1]
Shen, Fumin [1]
Ji, Yanli [1]
Gao, Lianli [1]
Shen, Heng Tao [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Graph neural networks (GNNs) have been widely used for 3D human pose estimation, since the pose of a human body is naturally modeled as a graph. However, most existing GNN-based models rely on filters with restricted receptive fields and on single-scale information, and so neglect valuable multi-scale contextual information. To tackle this issue, we propose a novel model named Graph Transformer Encoder-Decoder with Atrous Convolution (PoseGTAC), which effectively extracts multi-scale context and long-range information. Specifically, PoseGTAC has two key components: Graph Atrous Convolution (GAC), which extracts local multi-scale information, and Graph Transformer Layer (GTL), which extracts global long-range information. The two are combined and stacked in an encoder-decoder structure, where graph pooling and unpooling let multi-scale information interact between local and global levels (e.g., part-scale and body-scale). Extensive experiments on the Human3.6M and MPI-INF-3DHP datasets demonstrate that the proposed PoseGTAC model achieves state-of-the-art performance.
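The abstract names the two components but does not give their definitions, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that an atrous graph convolution with dilation rate k mean-aggregates features from joints exactly k hops away on the skeleton graph, and that the global step is plain single-head scaled dot-product self-attention over all joints. The names `hop_mask`, `atrous_graph_conv`, and `self_attention`, the 5-joint chain, and all weight shapes are hypothetical.

```python
import numpy as np

def hop_mask(adj, k):
    """Boolean mask of node pairs whose shortest-path distance is exactly k."""
    n = adj.shape[0]
    step = (adj + np.eye(n)) > 0          # one hop, self-loops included
    within = np.eye(n, dtype=bool)        # pairs within 0 hops
    prev = within
    for _ in range(k):
        prev = within
        within = (within.astype(int) @ step.astype(int)) > 0
    return within & ~prev                 # within k hops but not within k-1

def atrous_graph_conv(x, adj, weights):
    """Sum of mean-aggregated features over hop distances 0..len(weights)-1.

    Assumption: weights[0] acts on the node itself and weights[k] on its
    exact k-hop ring, which plays the role of the dilation rate.
    """
    out = x @ weights[0]
    for k in range(1, len(weights)):
        mask = hop_mask(adj, k).astype(float)
        deg = np.maximum(mask.sum(axis=1, keepdims=True), 1.0)  # avoid divide-by-zero
        out = out + (mask / deg) @ x @ weights[k]
    return out

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product attention over all nodes."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)
    return attn @ v, attn

# Toy 5-joint chain (hypothetical skeleton fragment): 0-1-2-3-4
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))                            # e.g. 2D joint coordinates
weights = [rng.normal(size=(2, 8)) for _ in range(3)]  # hops 0, 1, 2
local = atrous_graph_conv(x, adj, weights)             # local multi-scale features
wq, wk, wv = (rng.normal(size=(8, 8)) for _ in range(3))
global_feat, attn = self_attention(local, wq, wk, wv)  # global long-range mixing
```

In this sketch the hop-ring masks widen the receptive field without adding intermediate neighbors, while the attention step lets every joint attend to every other joint regardless of graph distance. Graph pooling and unpooling, which the abstract also mentions, are omitted.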
Pages: 1359-1365
Page count: 7