Temporal Action Localization With Coarse-to-Fine Network

Cited by: 2
Authors
Zhang, Min [1]
Hu, Haiyang [2]
Li, Zhongjin [2]
Affiliations
[1] Zhejiang Ind Polytech Coll, Dept Design & Art, Shaoxing 312000, Peoples R China
[2] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Hangzhou 310018, Peoples R China
Source
IEEE ACCESS | 2022 / Vol. 10
Keywords
Learning systems; Videos; Location awareness; Transformers; Feature extraction; Logic gates; Temporal action localization; action detection; action granularity; progressive learning;
DOI
10.1109/ACCESS.2022.3205594
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Precisely localizing the temporal interval of each action segment in long raw videos is an essential challenge in practical video content analysis (e.g., activity detection or video caption generation). Most previous works neglect the hierarchical action granularity (e.g., embracing, approaching, or turning a screw in mechanical maintenance) and eventually fail to identify precise action boundaries. In this paper, we introduce a simple yet efficient coarse-to-fine network (CFNet) that addresses temporal action localization by progressively refining action boundaries at multiple action granularities. The proposed CFNet is mainly composed of three components: a coarse proposal module (CPM) to generate coarse action candidates, a fusion block (FB) to enhance feature representation by fusing the coarse candidate features with the corresponding features of the raw input frames, and a boundary transformer module (BTM) to further refine action boundaries. Specifically, CPM exploits framewise, matching, and gated actionness curves that complement each other for coarse candidate generation at different levels, while FB enriches the feature representation by fusing the last feature map of CPM with the corresponding raw frame input. Finally, BTM learns long-term temporal dependencies with a transformer structure to refine action boundaries at a finer granularity. Thus, fine-grained action intervals are obtained incrementally. Compared with previous state-of-the-art techniques, the proposed coarse-to-fine network can asymptotically approach fine-grained action boundaries. Comprehensive experiments on the publicly available THUMOS14 and ActivityNet-v1.3 datasets show that our method improves over prior methods on various video action parsing tasks.
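The coarse-to-fine idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' CFNet: it assumes the three actionness curves are already available as per-frame scores in [0, 1], averages them (an assumed fusion rule), thresholds the result to obtain coarse proposals, and then snaps each boundary to the nearby position with the largest score change as a simple stand-in for the learned boundary refinement. All function names and parameters are hypothetical.

```python
from typing import List, Tuple


def fuse_curves(framewise: List[float], matching: List[float],
                gated: List[float]) -> List[float]:
    """Average three per-frame actionness curves (assumed fusion rule)."""
    return [(f + m + g) / 3.0 for f, m, g in zip(framewise, matching, gated)]


def coarse_proposals(actionness: List[float],
                     threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Group consecutive frames scoring >= threshold into [start, end) intervals."""
    proposals = []
    start = None
    for t, score in enumerate(actionness):
        if score >= threshold and start is None:
            start = t                      # an action candidate begins here
        elif score < threshold and start is not None:
            proposals.append((start, t))   # candidate ended before frame t
            start = None
    if start is not None:                  # candidate runs to the last frame
        proposals.append((start, len(actionness)))
    return proposals


def refine_boundary(actionness: List[float], boundary: int,
                    window: int = 2) -> int:
    """Snap a boundary to the position within +/- window frames where the
    score changes most between frame t-1 and frame t (heuristic stand-in
    for a learned refinement step)."""
    n = len(actionness)
    if boundary <= 0 or boundary >= n:
        return boundary                    # keep boundaries at the video edges
    lo = max(1, boundary - window)
    hi = min(n - 1, boundary + window)
    return max(range(lo, hi + 1),
               key=lambda t: abs(actionness[t] - actionness[t - 1]))


# Toy per-frame scores for an 8-frame clip (made-up illustrative values).
framewise = [0.1, 0.2, 0.6, 0.9, 0.9, 0.8, 0.3, 0.1]
matching = [0.0, 0.1, 0.7, 0.8, 0.9, 0.7, 0.2, 0.1]
gated = [0.2, 0.3, 0.5, 0.9, 0.9, 0.9, 0.4, 0.1]

fused = fuse_curves(framewise, matching, gated)
coarse = coarse_proposals(fused, threshold=0.25)   # deliberately loose threshold
refined = [(refine_boundary(fused, s), refine_boundary(fused, e))
           for s, e in coarse]
```

With the loose threshold, the coarse interval includes a weakly scored trailing frame, and the refinement step moves the end boundary back to the sharpest drop in the fused curve. In the paper, both stages are learned modules rather than fixed heuristics.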
Pages: 96378-96387
Page count: 10
Related papers
50 in total
  • [41] CFNet: A Coarse-to-Fine Network for Few Shot Semantic Segmentation
    Liu, Jiade
    Jung, Cheolkon
    2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2022,
  • [42] HafaNet: An Efficient Coarse-to-Fine Facial Landmark Detection Network
    Zheng, Shaun
    Bai, Xiuxiu
    Ye, Lele
    Fang, Zhan
    IEEE ACCESS, 2020, 8 : 123037 - 123043
  • [43] An edge guided coarse-to-fine generative network for image outpainting
    Xu, Yiwen
    Pagnucco, Maurice
    Song, Yang
    NEUROCOMPUTING, 2023, 541
  • [44] A Coarse-to-Fine Instance Segmentation Network with Learning Boundary Representation
    Luo, Feng
    Gao, Bin-Bin
    Yan, Jiangpeng
    Li, Xiu
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [45] 'Coarse-to-fine' cyclopean processing
    Popple, AV
    Findlay, JM
    PERCEPTION, 1999, 28 (02) : 155 - 165
  • [46] Coarse-to-fine face detection
    Fleuret, F
    Geman, D
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2001, 41 (1-2) : 85 - 107
  • [47] Coarse-to-fine manifold learning
    Castro, R
    Willett, R
    Nowak, R
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL III, PROCEEDINGS: IMAGE AND MULTIDIMENSIONAL SIGNAL PROCESSING SPECIAL SESSIONS, 2004, : 992 - 995
  • [48] Accurate Edge Localization of Complex Workpiece Based on Coarse-to-fine Principle
    Zhang, Panjie
    Wang, Hongyan
    Cheng, Wei
    Li, Jinping
    2018 11TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI 2018), 2018,
  • [49] Coarse-to-Fine Grained Classification
    Huo, Yuqi
    Lu, Yao
    Niu, Yulei
    Lu, Zhiwu
    Wen, Ji-Rong
    PROCEEDINGS OF THE 42ND INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '19), 2019, : 1033 - 1036
  • [50] Coarse-to-fine dynamic programming
    Raphael, C
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2001, 23 (12) : 1379 - 1390