A Multiscale Coarse-to-Fine Human Pose Estimation Network With Hard Keypoint Mining

Cited by: 2
Authors
Jiang, Xiaoyan [1]
Tao, Hangyu [1]
Hwang, Jenq-Neng [2]
Fang, Zhijun [3]
Affiliations
[1] Shanghai Univ Engn Sci, Sch Elect & Elect Engn, Shanghai 201620, Peoples R China
[2] Univ Washington, Dept Elect & Comp Engn, Seattle, WA 98195 USA
[3] Donghua Univ, Sch Comp Sci & Technol, Shanghai 200051, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Pose estimation; Standards; Convolution; Training; Task analysis; Heating systems; Detectors; Hard sample mining; human pose estimation; multiscale;
DOI
10.1109/TSMC.2023.3328876
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Current convolutional neural network (CNN)-based multiperson pose estimators have made great progress; however, they pay little or no attention to "hard" samples, such as occluded keypoints, small and nearly invisible keypoints, and ambiguous keypoints. In this article, we explicitly deal with these "hard" samples by proposing a novel multiscale coarse-to-fine human pose estimation network (HM²PN), which includes two sequential subnetworks: CoarseNet and FineNet. CoarseNet conducts a coarse prediction to locate "simple" keypoints like hands and ankles with a multiscale fusion module, which is integrated with a bottleneck, resulting in a novel module called the multiscale bottleneck. The new module improves the multiscale representation ability of the network at a fine-grained level, while marginally reducing the computation cost because of group convolution. FineNet further infers "hard" keypoints and simultaneously refines "simple" keypoints with a hard keypoint mining loss. Unlike previous works, the proposed loss treats "hard" keypoints differently and prevents "simple" keypoints from dominating the computed gradients during training. Experiments on the COCO keypoint benchmark show that our approach achieves superior pose estimation performance compared with other state-of-the-art methods. Source code is available for further research: https://github.com/sues-vision/C2F-HumanPoseEstimation.
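The abstract does not give the formula of the hard keypoint mining loss, so the following is only a minimal sketch of the general idea, not the authors' loss. It assumes a generic top-k selection in the style of online hard keypoint mining: a per-keypoint heatmap error is computed, and only the k keypoints with the largest error contribute, so that "simple" keypoints cannot dominate the gradient. The function name, the `visible` mask, and the `top_k` parameter are hypothetical choices made for illustration.

```python
import numpy as np


def hard_keypoint_mining_loss(pred, target, visible, top_k=8):
    """Top-k hard keypoint loss (illustrative stand-in, not the paper's loss).

    pred, target: heatmaps of shape (K, H, W) for K keypoints.
    visible: shape (K,), 1.0 for labeled keypoints and 0.0 otherwise.
    top_k: hypothetical number of hardest keypoints to keep.
    """
    # Mean squared error of each keypoint's heatmap; unlabeled ones are zeroed.
    per_keypoint = ((pred - target) ** 2).mean(axis=(1, 2)) * visible
    # Keep only the k largest errors, i.e. the "hard" keypoints.
    k = min(top_k, per_keypoint.shape[0])
    hardest = np.sort(per_keypoint)[-k:]
    return float(hardest.mean())


# Toy usage: four keypoints whose predictions are off by 0, 1, 2 and 3.
target = np.zeros((4, 2, 2))
pred = np.stack([np.full((2, 2), float(i)) for i in range(4)])
all_visible = np.ones(4)
loss = hard_keypoint_mining_loss(pred, target, all_visible, top_k=2)
```

In this toy case the per-keypoint errors are 0, 1, 4 and 9, so with `top_k=2` only the two worst keypoints are averaged. A plain mean over all four keypoints would instead dilute those two errors with the easy ones.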
Pages: 1730-1741
Number of pages: 12