Weakly Supervised Action Localization by Sparse Temporal Pooling Network

Cited by: 249
Authors
Nguyen, Phuc [1]
Liu, Ting [2]
Prasad, Gautam [2]
Han, Bohyung [3]
Affiliations
[1] Univ Calif Irvine, Irvine, CA 92697 USA
[2] Google, Venice, CA USA
[3] Seoul Natl Univ, Seoul, South Korea
Keywords
DOI
10.1109/CVPR.2018.00706
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a weakly supervised temporal action localization algorithm on untrimmed videos using convolutional neural networks. Our algorithm learns from video-level class labels and predicts temporal intervals of human actions without requiring temporal localization annotations. We design our network to identify a sparse subset of key segments associated with target actions in a video using an attention module, and to fuse the key segments through adaptive temporal pooling. Our loss function consists of two terms: one minimizes the video-level action classification error, and the other enforces the sparsity of the segment selection. At inference time, we extract and score temporal proposals using temporal class activations and class-agnostic attentions to estimate the time intervals that correspond to target actions. The proposed algorithm attains state-of-the-art results on the THUMOS14 dataset and outstanding performance on ActivityNet1.3 despite its weak supervision.
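The pipeline in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the sigmoid attention, the L1 form of the sparsity term, the weight `beta`, the fixed threshold, and every function and parameter name below are assumptions made for illustration, since the abstract only states that the loss has a classification term and a sparsity term and that proposals are scored with class activations and class-agnostic attention.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def attention_pool(features, w_att, b_att):
    """Class-agnostic attention per segment and attention-weighted pooling.

    features: (T, D) segment features; w_att: (D,); b_att: scalar.
    Returns attention of shape (T,) and a pooled video feature of shape (D,).
    """
    attention = sigmoid(features @ w_att + b_att)  # one weight in (0, 1) per segment
    pooled = attention @ features                  # weighted sum over time
    return attention, pooled


def video_loss(features, labels, w_att, b_att, w_cls, b_cls, beta=1e-4):
    """Video-level classification loss plus a sparsity penalty (assumed L1)."""
    attention, pooled = attention_pool(features, w_att, b_att)
    probs = sigmoid(pooled @ w_cls + b_cls)        # (C,) video-level class scores
    eps = 1e-7
    bce = -np.mean(labels * np.log(probs + eps)
                   + (1.0 - labels) * np.log(1.0 - probs + eps))
    sparsity = np.sum(np.abs(attention))           # pushes most weights toward 0
    return bce + beta * sparsity


def weighted_class_activations(features, w_att, b_att, w_cls, b_cls):
    """Per-segment class scores scaled by the class-agnostic attention, (T, C)."""
    attention, _ = attention_pool(features, w_att, b_att)
    activations = sigmoid(features @ w_cls + b_cls)  # (T, C)
    return attention[:, None] * activations


def extract_intervals(scores, threshold=0.5):
    """Group consecutive segments above threshold into (start, end) intervals.

    The end index is exclusive. The threshold value is an assumption.
    """
    intervals, start = [], None
    for t, active in enumerate(scores > threshold):
        if active and start is None:
            start = t
        elif not active and start is not None:
            intervals.append((start, t))
            start = None
    if start is not None:
        intervals.append((start, len(scores)))
    return intervals
```

With trained weights, one column of `weighted_class_activations` would be passed to `extract_intervals` to obtain candidate intervals for that class; how such intervals are scored and filtered afterwards is left out of this sketch.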
Pages: 6752-6761
Number of pages: 10