Image-free Domain Generalization via CLIP for 3D Hand Pose Estimation

Cited by: 7
Authors
Lee, Seongyeong [1, 2]
Park, Hansoo [1 ]
Kim, Dong Uk [1 ]
Kim, Jihyeon [1 ]
Boboev, Muhammadjon [1 ]
Baek, Seungryul [1 ]
Affiliations
[1] UNIST, Ulsan, South Korea
[2] NC Soft, Seongnam, South Korea
DOI
10.1109/WACV56688.2023.00295
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
RGB-based 3D hand pose estimation has been successful for decades thanks to large-scale databases and deep learning. However, a hand pose estimation network does not perform well on hand pose images whose characteristics differ greatly from the training data. This gap is caused by factors such as illumination, camera angle, and diverse backgrounds in the input images. Many existing methods have tried to close it by supplying additional large-scale unconstrained or target-domain images to augment the data space; however, collecting such images is labor-intensive. In this paper, we present a simple image-free domain generalization approach for the hand pose estimation framework that uses only source-domain data. We manipulate the image features of the hand pose estimation network by adding features from text descriptions obtained with the CLIP (Contrastive Language-Image Pre-training) model. The manipulated image features are then used to train the hand pose estimation network in a contrastive learning framework. In experiments on the STB and RHD datasets, our algorithm shows improved performance over state-of-the-art domain generalization approaches.
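The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes: shift an image feature by adding a text feature, then pull each original feature toward its shifted counterpart with a contrastive loss. Everything here is an assumption made for illustration: the additive rule with weight `alpha`, the InfoNCE form of the loss, the `temperature` value, and the function names `add_text_feature` and `info_nce`. Random vectors stand in for real CLIP embeddings, and this is not the authors' implementation.

```python
import math
import random


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def add_text_feature(image_feat, text_feat, alpha=0.5):
    """Shift an image feature toward a text feature.

    Assumed rule (not from the paper): f' = normalize(f + alpha * t).
    """
    mixed = [i + alpha * t for i, t in zip(image_feat, text_feat)]
    return l2_normalize(mixed)


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: anchors[k] should match positives[k], not the other rows."""
    total = 0.0
    for k, anchor in enumerate(anchors):
        logits = [
            sum(x * y for x, y in zip(anchor, pos)) / temperature
            for pos in positives
        ]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denominator - logits[k]
    return total / len(anchors)


rng = random.Random(0)
dim, batch = 8, 4

# Stand-ins for image features from a pose network and one CLIP text embedding.
image_feats = [
    l2_normalize([rng.gauss(0, 1) for _ in range(dim)]) for _ in range(batch)
]
text_feat = l2_normalize([rng.gauss(0, 1) for _ in range(dim)])

manipulated = [add_text_feature(f, text_feat) for f in image_feats]
loss = info_nce(image_feats, manipulated)
print(f"contrastive loss: {loss:.4f}")
```

In a real pipeline the random vectors would be replaced by features from the pose estimation backbone and by text embeddings of domain descriptions, and the loss would be minimized alongside the pose regression objective.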
Pages: 2933 - 2943
Number of pages: 11
Related papers
50 in total
  • [31] Survey on depth and RGB image-based 3D hand shape and pose estimation
    Lin HUANG
    Boshen ZHANG
    Zhilin GUO
    Yang XIAO
    Zhiguo CAO
    Junsong YUAN
    虚拟现实与智能硬件(中英文) [Virtual Reality and Intelligent Hardware], 2021, 3 (03) : 207 - 234
  • [32] Survey on depth and RGB image-based 3D hand shape and pose estimation
    Huang L.
    Zhang B.
    Guo Z.
    Xiao Y.
    Cao Z.
    Yuan J.
    Virtual Reality and Intelligent Hardware, 2021, 3 (03) : 207 - 234
  • [33] Occlusion-Robust 3D Hand Pose Estimation from a Single RGB Image
    Ishii, Asuka
    Nakano, Gaku
    Inoshita, Tetsuo
    PROCEEDINGS OF 17TH INTERNATIONAL CONFERENCE ON MACHINE VISION APPLICATIONS (MVA 2021), 2021,
  • [34] Efficient Domain Adaptation via Generative Prior for 3D Infant Pose Estimation
    Zhou, Zhuoran
    Jiang, Zhongyu
    Chai, Wenhao
    Yang, Cheng-Yen
    Li, Lei
    Hwang, Jenq-Neng
    2024 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS, WACVW 2024, 2024, : 51 - 59
  • [35] Hand PointNet: 3D Hand Pose Estimation using Point Sets
    Ge, Liuhao
    Cai, Yujun
    Weng, Junwu
    Yuan, Junsong
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 8417 - 8426
  • [36] Fast and Accurate 3D Hand Pose Estimation via Recurrent Neural Network for Capturing Hand Articulations
    Yoo, Cheol-Hwan
    Ji, Seowon
    Shin, Yong-Goo
    Kim, Seung-Wook
    Ko, Sung-Jea
    IEEE ACCESS, 2020, 8 : 114010 - 114019
  • [37] Single image based 3D human pose estimation via uncertainty learning
    Han, Chuchu
    Yu, Xin
    Gao, Changxin
    Sang, Nong
    Yang, Yi
    PATTERN RECOGNITION, 2022, 132
  • [38] 3D Hand Pose Estimation via aligned latent space injection and kinematic losses
    Stergioulas, Andreas
    Chatzis, Theocharis
    Konstantinidis, Dimitrios
    Dimitropoulos, Kosmas
    Daras, Petros
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 1730 - 1739
  • [39] 3D human pose estimation from a single image via exemplar augmentation
    Yang, Jingjing
    Wan, Lili
    Xu, Wanru
    Wang, Shenghui
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2019, 59 : 371 - 379
  • [40] Attention-Based Pose Sequence Machine for 3D Hand Pose Estimation
    Guo, Fangtai
    He, Zaixing
    Zhang, Shuyou
    Zhao, Xinyue
    Tan, Jianrong
    IEEE ACCESS, 2020, 8 : 18258 - 18269