A Multiscale Coarse-to-Fine Human Pose Estimation Network With Hard Keypoint Mining

Cited by: 2
Authors
Jiang, Xiaoyan [1 ]
Tao, Hangyu [1 ]
Hwang, Jenq-Neng [2 ]
Fang, Zhijun [3 ]
Affiliations
[1] Shanghai Univ Engn Sci, Sch Elect & Elect Engn, Shanghai 201620, Peoples R China
[2] Univ Washington, Dept Elect & Comp Engn, Seattle, WA 98195 USA
[3] Donghua Univ, Sch Comp Sci & Technol, Shanghai 200051, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Pose estimation; Standards; Convolution; Training; Task analysis; Heating systems; Detectors; Hard sample mining; human pose estimation; multiscale;
DOI
10.1109/TSMC.2023.3328876
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Current convolutional neural network (CNN)-based multiperson pose estimators have made great progress; however, they pay little or no attention to "hard" samples, such as occluded keypoints, small and nearly invisible keypoints, and ambiguous keypoints. In this article, we explicitly deal with these "hard" samples by proposing a novel multiscale coarse-to-fine human pose estimation network ((HMPN)-P-2), which includes two sequential subnetworks: CoarseNet and FineNet. CoarseNet conducts a coarse prediction to locate "simple" keypoints like hands and ankles with a multiscale fusion module, which is integrated with a bottleneck, resulting in a novel module called the multiscale bottleneck. The new module improves the multiscale representation ability of the network at a fine-grained level, while marginally reducing the computation cost because of group convolution. FineNet further infers "hard" keypoints and refines "simple" keypoints simultaneously with a hard keypoint mining loss. Distinct from previous works, the proposed loss deals with "hard" keypoints differentially and prevents "simple" keypoints from dominating the computed gradients during training. Experiments on the COCO keypoint benchmark show that our approach achieves superior pose estimation performance compared with other state-of-the-art methods. Source code is available for further research: https://github.com/sues-vision/C2F-HumanPoseEstimation.
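The abstract does not give the formula of the hard keypoint mining loss, so the following is only a minimal sketch of one common form of hard keypoint mining (averaging the loss over the top-k highest-error keypoints), not the authors' loss. The function name `hard_keypoint_mining_loss`, the `top_k` parameter, and the flat-list heatmap layout are assumptions made for illustration.

```python
def hard_keypoint_mining_loss(pred, target, top_k):
    """Generic top-k hard keypoint mining (illustrative, not the paper's loss).

    pred, target: one heatmap per keypoint, each given as a flat list of floats.
    Returns the mean of the per-keypoint MSE values over the top_k keypoints
    with the largest error, so easy keypoints do not dominate the result.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of keypoints")
    if not 1 <= top_k <= len(pred):
        raise ValueError("top_k must be between 1 and the number of keypoints")

    # Per-keypoint mean squared error between predicted and target heatmaps.
    per_keypoint = []
    for p, t in zip(pred, target):
        mse = sum((a - b) ** 2 for a, b in zip(p, t)) / len(p)
        per_keypoint.append(mse)

    # Keep only the hardest (highest-loss) keypoints and average them.
    hardest = sorted(per_keypoint, reverse=True)[:top_k]
    return sum(hardest) / top_k


# Hypothetical toy data: three keypoints with two-pixel heatmaps.
pred = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
target = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
loss_top2 = hard_keypoint_mining_loss(pred, target, top_k=2)
loss_all = hard_keypoint_mining_loss(pred, target, top_k=3)
```

With `top_k` equal to the number of keypoints this reduces to a plain per-keypoint MSE average; smaller values concentrate the loss on the keypoints that are currently predicted worst.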
Pages: 1730-1741
Number of pages: 12