Temporal Feature Enhancement Dilated Convolution Network for Weakly-supervised Temporal Action Localization

Cited by: 8
Authors
Zhou, Jianxiong [1 ]
Wu, Ying [1 ]
Affiliations
[1] Northwestern Univ, Dept Elect & Comp Engn, Evanston, IL 60208 USA
DOI
10.1109/WACV56688.2023.00597
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Weakly-supervised Temporal Action Localization (WTAL) aims to classify and localize action instances in untrimmed videos using only video-level labels. Existing methods typically use snippet-level RGB and optical flow features taken directly from pre-trained extractors. Owing to two limitations, the short temporal span of snippets and the unsuitable initial features, these WTAL methods make poor use of temporal information and achieve limited performance. In this paper, we propose the Temporal Feature Enhancement Dilated Convolution Network (TFE-DCN) to address these two limitations. The proposed TFE-DCN has an enlarged receptive field that covers a long temporal span to observe the full dynamics of action instances, which makes it effective at capturing temporal dependencies between snippets. Furthermore, we propose the Modality Enhancement Module, which enhances RGB features with the help of enhanced optical flow features, making the overall features appropriate for the WTAL task. Experiments conducted on the THUMOS'14 and ActivityNet v1.3 datasets show that our proposed approach far outperforms state-of-the-art WTAL methods.
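The abstract's key mechanism, enlarging the temporal receptive field by stacking dilated convolutions over snippet features, can be illustrated with a minimal numpy sketch. This is a hypothetical illustration of the general dilated-convolution idea, not the authors' TFE-DCN implementation; the kernel size, dilation schedule, and function names below are assumptions for demonstration only.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Single-channel dilated 1D convolution along the temporal axis
    with zero ('same') padding.

    x: (T, C) array of snippet features (T snippets, C channels).
    w: (K, C) kernel with K temporal taps.
    Returns a (T,) response, one value per snippet position.
    """
    T, C = x.shape
    K = w.shape[0]
    pad = dilation * (K - 1) // 2          # keep output length equal to T
    xp = np.pad(x, ((pad, pad), (0, 0)))   # zero-pad the temporal axis
    out = np.zeros(T)
    for t in range(T):
        for k in range(K):
            # Taps are spaced `dilation` snippets apart, so each output
            # position sees a wider temporal span at no extra parameter cost.
            out[t] += np.dot(xp[t + k * dilation], w[k])
    return out

def receptive_field(kernel_size, dilations):
    """Temporal receptive field (in snippets) of stacked dilated conv layers."""
    rf = 1
    for d in dilations:
        rf += d * (kernel_size - 1)
    return rf

# Stacking kernel-size-3 layers with dilations 1, 2, 4 covers 15 snippets,
# versus 7 for three ordinary (dilation-1) layers.
print(receptive_field(3, [1, 2, 4]))  # 15
```

The exponential dilation schedule (1, 2, 4, ...) is the standard way such networks cover long temporal spans with few layers, which matches the abstract's claim of observing the full dynamics of long action instances.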
Pages: 6017-6026 (10 pages)