A Multiscale Coarse-to-Fine Human Pose Estimation Network With Hard Keypoint Mining

Cited by: 2
Authors
Jiang, Xiaoyan [1]
Tao, Hangyu [1]
Hwang, Jenq-Neng [2]
Fang, Zhijun [3]
Affiliations
[1] Shanghai Univ Engn Sci, Sch Elect & Elect Engn, Shanghai 201620, Peoples R China
[2] Univ Washington, Dept Elect & Comp Engn, Seattle, WA 98195 USA
[3] Donghua Univ, Sch Comp Sci & Technol, Shanghai 200051, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Pose estimation; Standards; Convolution; Training; Task analysis; Heating systems; Detectors; Hard sample mining; human pose estimation; multiscale;
DOI
10.1109/TSMC.2023.3328876
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Current convolutional neural network (CNN)-based multiperson pose estimators have made great progress; however, they pay little or no attention to "hard" samples, such as occluded keypoints, small and nearly invisible keypoints, and ambiguous keypoints. In this article, we explicitly deal with these "hard" samples by proposing a novel multiscale coarse-to-fine human pose estimation network ((HMPN)-P-2), which comprises two sequential subnetworks: CoarseNet and FineNet. CoarseNet makes a coarse prediction to locate "simple" keypoints, such as hands and ankles, using a multiscale fusion module that is integrated with the bottleneck, resulting in a novel module called the multiscale bottleneck. The new module improves the multiscale representation ability of the network at a fine-grained level while marginally reducing the computation cost through group convolution. FineNet then infers "hard" keypoints and refines "simple" keypoints simultaneously using a hard keypoint mining loss. Unlike previous works, the proposed loss treats "hard" keypoints differentially and prevents "simple" keypoints from dominating the gradients computed during training. Experiments on the COCO keypoint benchmark show that our approach achieves superior pose estimation performance compared with other state-of-the-art methods. Source code is available for further research: https://github.com/sues-vision/C2F-HumanPoseEstimation.
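The abstract does not give the formula of the hard keypoint mining loss, so the following is only a minimal sketch of the general idea it builds on: a common top-k scheme that averages the loss over the highest-loss keypoints so that easy keypoints do not dominate. It is not the authors' loss, which the abstract describes as distinct from previous works. The function names, the `top_k` parameter, and the toy heatmaps are all assumptions made for illustration.

```python
from typing import List, Sequence


def per_keypoint_mse(pred: Sequence[Sequence[float]],
                     target: Sequence[Sequence[float]]) -> List[float]:
    """Mean squared error of each keypoint's flattened heatmap."""
    losses = []
    for p, t in zip(pred, target):
        losses.append(sum((a - b) ** 2 for a, b in zip(p, t)) / len(p))
    return losses


def hard_keypoint_loss(pred: Sequence[Sequence[float]],
                       target: Sequence[Sequence[float]],
                       top_k: int) -> float:
    """Average the loss over only the top_k highest-loss keypoints.

    Hypothetical baseline: generic top-k mining, not the paper's loss.
    """
    losses = per_keypoint_mse(pred, target)
    hardest = sorted(losses, reverse=True)[:top_k]
    return sum(hardest) / len(hardest)


# Toy data (assumed): 3 keypoints, each with a 2-pixel flattened heatmap.
pred = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
target = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]

plain = sum(per_keypoint_mse(pred, target)) / len(pred)  # all keypoints
mined = hard_keypoint_loss(pred, target, top_k=1)        # hardest only
print(plain, mined)
```

In this toy case the already-correct first keypoint pulls the plain average down, whereas the mined loss is driven entirely by the worst-predicted keypoint.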
Pages: 1730-1741
Number of pages: 12