A hybrid network for estimating 3D interacting hand pose from a single RGB image

Authors
Bao, Wenxia [1]
Gao, Qiuyue [1]
Yang, Xianjun [2]
Affiliations
[1] Anhui Univ, Sch Elect & Informat Engn, Hefei 230601, Anhui, Peoples R China
[2] Chinese Acad Sci, Hefei Inst Phys Sci, Hefei 230031, Anhui, Peoples R China
Keywords
3D hand pose estimation; Interacting Hand; Hybrid network; End to end network; TEXT; RECOGNITION; KHATT;
DOI
10.1007/s11760-024-03043-1
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology];
Discipline classification code
0808; 0809;
Abstract
Estimating 3D interacting hand pose from a single RGB image is challenging: in two-handed interactions the hands tend to occlude each other and look alike. In this study, a simple, accurate end-to-end framework called HybridPoseNet is proposed for estimating 3D interacting hand pose. The hybrid network employs an encoder-decoder architecture. The feature encoder is a hybrid structure that combines a convolutional neural network (CNN) with a transformer to encode hand information. An ordinary CNN extracts the local detailed features of a given image, and a vision transformer captures the long-distance spatial interactions between feature vectors at different positions. The 3D pose decoder is based on left and right network branches, which are fused via a feature enhancement module (FEM). The FEM helps reduce the ambiguity in appearance caused by the self-similarity of the hands. The decoder lifts the 2D pose to the 3D pose by estimating two depth components. Ablation experiments demonstrate the effectiveness of each module in the network. In addition, comprehensive experiments on the InterHand2.6M dataset show that the proposed method outperforms previous state-of-the-art methods for estimating interacting hand pose.
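The lifting step mentioned in the abstract (2D pose plus two estimated depth components giving a 3D pose) can be illustrated with a minimal sketch. The abstract does not say what the two depth components are, so everything below is an assumption rather than the authors' method: the sketch assumes one component is a per-joint depth relative to the hand root and the other is the depth offset of the left-hand root relative to the right-hand root, and it assumes a pinhole camera with known intrinsics and a known absolute depth for the right-hand root. All names (`Intrinsics`, `lift_joint`, `lift_hand`, `lift_two_hands`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


@dataclass
class Intrinsics:
    """Pinhole camera intrinsics (assumed known; not given in the abstract)."""
    fx: float
    fy: float
    cx: float
    cy: float


def lift_joint(u: float, v: float, z: float, cam: Intrinsics) -> Point3D:
    """Back-project pixel (u, v) at absolute depth z into camera space."""
    x = (u - cam.cx) * z / cam.fx
    y = (v - cam.cy) * z / cam.fy
    return (x, y, z)


def lift_hand(joints_2d: List[Point2D], rel_depths: List[float],
              root_depth: float, cam: Intrinsics) -> List[Point3D]:
    """Lift one hand: absolute depth = root depth + per-joint relative depth."""
    return [lift_joint(u, v, root_depth + dz, cam)
            for (u, v), dz in zip(joints_2d, rel_depths)]


def lift_two_hands(right_2d: List[Point2D], right_rel: List[float],
                   left_2d: List[Point2D], left_rel: List[float],
                   right_root_depth: float, left_root_offset: float,
                   cam: Intrinsics) -> Tuple[List[Point3D], List[Point3D]]:
    """Lift both hands; the left root is placed relative to the right root
    (assumed meaning of the second depth component)."""
    right = lift_hand(right_2d, right_rel, right_root_depth, cam)
    left = lift_hand(left_2d, left_rel,
                     right_root_depth + left_root_offset, cam)
    return right, left
```

Under these assumptions, the network only needs to predict image-plane joint locations and relative depths; camera-space coordinates follow from the back-projection, which is why resolving the relative depth between the two hands matters for interacting poses.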
Pages: 3801 - 3814
Page count: 14