Temporal Feature Enhancement Dilated Convolution Network for Weakly-supervised Temporal Action Localization

Cited by: 8
Authors
Zhou, Jianxiong [1 ]
Wu, Ying [1 ]
Affiliations
[1] Northwestern Univ, Dept Elect & Comp Engn, Evanston, IL 60208 USA
DOI
10.1109/WACV56688.2023.00597
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Weakly-supervised Temporal Action Localization (WTAL) aims to classify and localize action instances in untrimmed videos using only video-level labels. Existing methods typically use snippet-level RGB and optical flow features taken directly from pre-trained extractors. Owing to two limitations, the short temporal span of snippets and the unsuitable initial features, these WTAL methods make poor use of temporal information and achieve limited performance. In this paper, we propose the Temporal Feature Enhancement Dilated Convolution Network (TFE-DCN) to address these two limitations. The proposed TFE-DCN has an enlarged receptive field that covers a long temporal span to observe the full dynamics of action instances, which makes it effective at capturing temporal dependencies between snippets. Furthermore, we propose the Modality Enhancement Module, which enhances RGB features with the help of enhanced optical flow features, making the overall features appropriate for the WTAL task. Experiments conducted on the THUMOS'14 and ActivityNet v1.3 datasets show that our proposed approach far outperforms state-of-the-art WTAL methods.
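The abstract's key mechanism, enlarging the temporal receptive field by stacking dilated convolutions over snippet features, can be illustrated with a minimal numpy sketch. This is a hypothetical illustration of the general dilated-convolution idea, not the authors' TFE-DCN implementation; the kernel size, dilation schedule, and function names below are assumptions for demonstration only.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Single-channel dilated 1D convolution along the temporal axis
    with zero ('same') padding.

    x: (T, C) array of snippet features (T snippets, C channels).
    w: (K, C) kernel with K temporal taps.
    Returns a (T,) response, one value per snippet position.
    """
    T, C = x.shape
    K = w.shape[0]
    pad = dilation * (K - 1) // 2          # keep output length equal to T
    xp = np.pad(x, ((pad, pad), (0, 0)))   # zero-pad the temporal axis
    out = np.zeros(T)
    for t in range(T):
        for k in range(K):
            # Taps are spaced `dilation` snippets apart, so each output
            # position sees a wider temporal span at no extra parameter cost.
            out[t] += np.dot(xp[t + k * dilation], w[k])
    return out

def receptive_field(kernel_size, dilations):
    """Temporal receptive field (in snippets) of stacked dilated conv layers."""
    rf = 1
    for d in dilations:
        rf += d * (kernel_size - 1)
    return rf

# Stacking kernel-size-3 layers with dilations 1, 2, 4 covers 15 snippets,
# versus 7 for three ordinary (dilation-1) layers.
print(receptive_field(3, [1, 2, 4]))  # 15
```

The exponential dilation schedule (1, 2, 4, ...) is the standard way such networks cover long temporal spans with few layers, which matches the abstract's claim of observing the full dynamics of long action instances.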
Pages: 6017-6026 (10 pages)