Multi-Scale Contrastive Learning for Human Pose Estimation

Cited by: 0
Authors
Bao, Wenxia [1]
Lin, An [1]
Huang, Hua [1]
Yang, Xianjun [1]
Chen, Hemu [1]
Affiliations
[1] Anhui Univ, Sch Elect & Informat Engn, Hefei 230601, Anhui, Peoples R China
Keywords
human pose estimation; contrastive learning; multi-scale feature; feature pyramid network
DOI
10.1587/transinf.2024EDP7048
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Recent years have seen remarkable progress in human pose estimation. However, manual annotation of keypoints remains tedious and imprecise. To alleviate this problem, this paper proposes a novel method called Multi-Scale Contrastive Learning (MSCL). The method uses a siamese network structure whose upper and lower branches capture different views of the same image. Each branch uses a backbone network to extract image representations, employing multi-scale feature vectors to capture information. These feature vectors are passed through an enhanced feature pyramid for fusion, producing more robust feature representations. The fused feature vectors are then further encoded by mapping and prediction heads to predict the feature vector of the other view. Using the negative cosine similarity between vectors as the loss function, the backbone network is pre-trained on a large-scale unlabelled dataset, enhancing its capacity to extract visual representations. Finally, transfer learning is performed on a small amount of labelled data for the pose estimation task. Experiments on the COCO dataset show significant improvements in Average Precision (AP) of 1.8%, 0.9%, and 1.2% with 1%, 5%, and 10% of the labelled data, respectively. In addition, the Percentage of Correct Keypoints (PCK) improves by 0.5% on MPII&AIC, outperforming mainstream contrastive learning methods.
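The loss named in the abstract, the negative cosine similarity between a predicted vector and the feature vector of the other view, can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names `negative_cosine_similarity` and `symmetric_loss` are hypothetical, and averaging over both prediction directions is an assumption that the abstract does not state.

```python
import math


def negative_cosine_similarity(p, z):
    """Return -cos(p, z) for two equal-length vectors (sequences of floats)."""
    if len(p) != len(z):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    if norm_p == 0.0 or norm_z == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return -dot / (norm_p * norm_z)


def symmetric_loss(p1, z2, p2, z1):
    """Average the loss over both directions (assumption, not from the abstract).

    p1, p2: prediction-head outputs for view 1 and view 2 (hypothetical names).
    z1, z2: target feature vectors for view 1 and view 2 (hypothetical names).
    """
    return 0.5 * negative_cosine_similarity(p1, z2) + 0.5 * negative_cosine_similarity(p2, z1)
```

The loss reaches its minimum of -1 when each prediction points in the same direction as the other view's feature vector, regardless of vector length, so minimizing it pulls the two views' representations into alignment.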
Pages: 1332-1341
Number of pages: 10
Related papers (50 in total)
  • [1] Selective Learning of Human Pose Estimation Based on Multi-Scale Convergence Network
    Liu, Wenkai
    Qin, Cuizhu
    Wu, Menglong
    Bai, Wenle
    Dong, Hongxia
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2023, E106D (05) : 1081 - 1084
  • [2] Multi-Scale Collaborative Network for Human Pose Estimation
    Guo, Chunsheng
    Zhou, Jialuo
    Du, Wenlong
    Zhang, Xuguang
    INTERNATIONAL JOURNAL OF HUMANOID ROBOTICS, 2019, 16 (04)
  • [3] MULTI-SCALE SUPERVISED NETWORK FOR HUMAN POSE ESTIMATION
    Ke, Lipeng
    Chang, Ming-Ching
    Qi, Honggang
    Lyu, Siwei
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 564 - 568
  • [4] MfvPose: A multi-scale hybrid framework for human pose estimation
    Ran, Lang
    Hong, Chaoqun
    Zhang, Xuebai
    Tang, Chaohui
    Xie, Yuhong
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (06) : 10769 - 10778
  • [5] Multi-Scale Feature Refined Network for Human Pose Estimation
    Yang, Qiaoning
    Ji, Xiaodong
    Yang, Xiuhui
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2024, 38 (01)
  • [6] Multi-Scale Subgraph Contrastive Learning
    Liu, Yanbei
    Zhao, Yu
    Wang, Xiao
    Geng, Lei
    Xiao, Zhitao
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 2215 - 2223
  • [7] Unsupervised Learning of Depth Estimation and Camera Pose With Multi-Scale GANs
    Xu, Yufan
    Wang, Yan
    Huang, Rui
    Lei, Zeyu
    Yang, Junyao
    Li, Zijian
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (10) : 17039 - 17047
  • [8] MSTPose: Learning-Enriched Visual Information with Multi-Scale Transformers for Human Pose Estimation
    Wu, Chengyu
    Wei, Xin
    Li, Shaohua
    Zhan, Ao
    ELECTRONICS, 2023, 12 (15)
  • [9] Human Pose Estimation with Multi-Scale and Multi-Level Feature Fusion
    Wang, Yanni
    Hu, Min
    Han, Shipeng
    Chen, Yixuan
    Lyu, Hao
    Computer Engineering and Applications, 2025, 61 (06) : 199 - 209
  • [10] Hand pose estimation with multi-scale network
    Zhongxu Hu
    Youmin Hu
    Bo Wu
    Jie Liu
    Dongmin Han
    Thomas Kurfess
    Applied Intelligence, 2018, 48 : 2501 - 2515