Diffusion-based framework for weakly-supervised temporal action localization

Authors
Zou, Yuanbing [1]
Zhao, Qingjie [1]
Sarker, Prodip Kumar [1]
Li, Shanshan [1]
Wang, Lei [2]
Liu, Wangwang [2]
Affiliations
[1] School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
[2] Beijing Institute of Control Engineering, Beijing, 100190, China
Keywords
Adversarial machine learning - Contrastive Learning - Federated learning - Semantics - Semi-supervised learning
DOI
10.1016/j.patcog.2024.111207
Abstract
Weakly supervised temporal action localization aims to localize action instances using only video-level supervision. Because frame-level annotations are absent, effectively separating action snippets from background in semantically ambiguous features is a major challenge for this task. To address this issue from a generative modeling perspective, we propose a novel two-stage diffusion-based network. In the first stage, we design a local masking module that learns local semantic information and generates binary masks, which (1) are used to perform action-background separation and (2) serve as the pseudo-ground truth required by the diffusion module. In the second stage, we propose a diffusion module that generates high-quality action predictions under this pseudo-ground-truth supervision. In addition, we further optimize the new-refining operation in the local masking module to improve its efficiency. Experimental results demonstrate that the proposed method achieves promising performance on the publicly available mainstream datasets THUMOS14 and ActivityNet. The code is available at https://github.com/Rlab123/action_diff. © 2024
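The abstract does not specify how the binary masks or the diffusion process are computed, so the following is only a minimal sketch under stated assumptions: a simple score threshold stands in for the local masking module, and the standard DDPM forward (noising) process with a linear beta schedule stands in for the diffusion module's training-time corruption of the pseudo-ground truth. The names `binary_mask`, `alpha_bar`, and `q_sample`, the threshold, and the schedule values are all hypothetical and are not taken from the authors' code.

```python
import numpy as np

def binary_mask(scores, threshold=0.5):
    """Threshold per-snippet actionness scores into a 0/1 mask.
    Assumption: a fixed threshold; the paper's masking rule is not given here."""
    return (np.asarray(scores, dtype=np.float64) >= threshold).astype(np.float64)

def alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bars, rng):
    """Sample x_t from the generic DDPM forward process:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    noise = rng.standard_normal(x0.shape)
    a = alpha_bars[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise, noise

rng = np.random.default_rng(0)

# Hypothetical per-snippet actionness scores for an 8-snippet video.
scores = [0.1, 0.2, 0.9, 0.8, 0.7, 0.3, 0.1, 0.6]
mask = binary_mask(scores)       # pseudo-ground truth: 1 = action, 0 = background
signal = 2.0 * mask - 1.0        # rescale {0, 1} to {-1, 1} before adding noise

alpha_bars = alpha_bar(1000)
x_t, noise = q_sample(signal, t=500, alpha_bars=alpha_bars, rng=rng)
```

In a setup like this, a denoising network would be trained to recover `signal` (or `noise`) from `x_t`, conditioned on the video features; that network is omitted because the abstract gives no detail about it.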