LCFFNet: A Lightweight Cross-scale Feature Fusion Network for human pose estimation

Citations: 0
Authors
Zou, Xuelian [1]
Bi, Xiaojun [2, 3]
Affiliations
[1] Harbin Engn Univ, Coll Informat & Commun Engn, Harbin, Heilongjiang, Peoples R China
[2] Minzu Univ China, Key Lab Ethn Language Intelligent Anal & Secur Gov, Beijing, Peoples R China
[3] Minzu Univ China, Sch Informat Engn, Beijing, Peoples R China
Keywords
Human pose estimation; 2D dynamic multi-scale convolution; Contextual semantic information; Adaptive feature fusion
DOI
10.1016/j.neunet.2024.106959
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Human pose estimation is one of the most critical and challenging problems in computer vision. It is applied in many computer vision fields and has important research significance. However, balancing a model's parameter count and computational load against the accuracy of human pose estimation remains difficult. In this study, we propose a Lightweight Cross-scale Feature Fusion Network (LCFFNet) to balance accuracy against computational load and parameter count. LCFFNet is made up of the Lightweight HRNet-Like (LHRNet) network, the Cross-Resolution-Aware Semantics Module (CRASM), and the Adapt Feature Fusion Module (AFFM). More precisely, first, we propose a lightweight LHRNet network that uses Dynamic Multi-scale Convolution Basic (DMSC-Basic) block, Basic block, and DMSC-Basic block submodules in the network's three high-resolution subnetwork stages. The proposed dynamic multi-scale convolution in the DMSC-Basic block reduces the number of parameters and the complexity of the LHRNet network, and can extract variable pose features. The Basic block is introduced to maintain the model's ability to express features. As a result, the LHRNet network not only makes the model more lightweight but also enhances its feature expression capabilities. Second, we propose a CRASM module that enhances contextual semantic information while reducing the semantic gap between different scales by fusing features from different scales. Finally, the spatial resolution of the augmented semantic feature map is restored from bottom to top using our proposed AFFM, and adaptive feature fusion is used to increase the localization accuracy of keypoints. Our method predicts keypoints with 74.2% AP, 89.9% PCKh@0.5 and 66.9% AP on the MSCOCO 2017, MPII and CrowdPose datasets, respectively. Our model reduces the number of parameters by 89.0% and the computational complexity by 87.5% compared with HRNet. The proposed network performs as well as current large-model human pose estimation networks while outperforming state-of-the-art lightweight networks.
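The abstract reports an MPII result as PCKh@0.5. As a reading aid, here is a minimal sketch of how that metric is commonly defined: a predicted keypoint counts as correct when its distance to the ground truth is at most 0.5 times the person's head segment length. This is not the authors' evaluation code. The function name `pckh`, its arguments, and the way head sizes are supplied are all assumptions made for illustration.

```python
import math


def pckh(pred, gt, head_sizes, visible=None, threshold=0.5):
    """Hypothetical helper: fraction of keypoints within threshold * head size.

    pred, gt:    per-person lists of (x, y) keypoints, same shape.
    head_sizes:  one head segment length per person (assumed given by the caller).
    visible:     optional per-person lists of booleans; unlabeled joints are skipped.
    """
    correct = 0
    total = 0
    for i, (pred_person, gt_person) in enumerate(zip(pred, gt)):
        limit = threshold * head_sizes[i]  # per-person distance tolerance
        for j, (p, g) in enumerate(zip(pred_person, gt_person)):
            if visible is not None and not visible[i][j]:
                continue  # skip joints without a ground-truth label
            total += 1
            if math.dist(p, g) <= limit:
                correct += 1
    return correct / total if total else 0.0


# Toy data (invented for illustration, not from the paper):
pred = [[(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)]]
gt = [[(10.0, 12.0), (20.0, 30.0), (30.0, 30.0)]]
score = pckh(pred, gt, head_sizes=[10.0])
```

With a head size of 10 the tolerance is 5 pixels, so the first and third keypoints of the toy person count as correct and the second does not.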
Pages: 15
Related papers
50 in total
  • [1] Repeated Cross-Scale Structure-Induced Feature Fusion Network for 2D Hand Pose Estimation
    Guan, Xin
    Shen, Huan
    Nyatega, Charles Okanda
    Li, Qiang
    ENTROPY, 2023, 25 (05)
  • [2] Lightweight Cross-Fusion Network on Human Pose Estimation for Edge Device
    Zhu, Xian
    Zeng, Xiaoqin
    Ma, Wei
    IEEE ACCESS, 2023, 11 : 134899 - 134907
  • [3] CSFFNet: Lightweight cross-scale feature fusion network for salient object detection in remote sensing images
    Wang, Longbao
    Long, Chong
    Li, Xin
    Tang, Xiaodan
    Bai, Zhipeng
    Gao, Hongmin
    IET IMAGE PROCESSING, 2024, 18 (03) : 602 - 614
  • [4] Development of a lightweight cross-scale decoupling feature fusion network for surface defect detection in permanent magnets
    Shu, Shuangbao
    Gao, Xinyu
    Zheng, Changjie
    Fu, Yufeng
    Wang, Jiyao
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2025, 36 (03)
  • [5] LFSimCC: Spatial fusion lightweight network for human pose estimation
    Zheng, Qian
    Guo, Hualing
    Yin, Yunhua
    Zheng, Bin
    Jiang, Hongxu
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 99
  • [6] LCFF-Net: A lightweight cross-scale feature fusion network for tiny target detection in UAV aerial imagery
    Tang, Daoze
    Tang, Shuyun
    Fan, Zhipeng
    PLOS ONE, 2024, 19 (12)
  • [7] Ghost shuffle lightweight pose network with effective feature representation and learning for human pose estimation
    Yang, Senquan
    Wen, Jiajun
    Fan, Junjun
    IET COMPUTER VISION, 2022, 16 (06) : 525 - 540
  • [8] CFFDist: Cross-Scale Feature Fusion Distillation Network for Industrial Anomaly Localization
    Zhi, Hui
    Qin, Hao
    Zhang, Lanning
    Guo, Jie
    Song, Bin
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2025, 74
  • [9] An Improved YOLOv8-Based Lightweight Attention Mechanism for Cross-Scale Feature Fusion
    Liu, Shaodong
    Shao, Faming
    Chu, Weijun
    Dai, Juying
    Zhang, Heng
    REMOTE SENSING, 2025, 17 (06)
  • [10] Human pose estimation based on cross-view feature fusion
    Sun, Dandan
    Wang, Siqi
    Xia, Hailun
    Zhang, Changan
    Gao, Jianlong
    Mao, Mingyu
    VISUAL COMPUTER, 2024, 40 (09) : 6581 - 6597