Action-Aware Network with Upper and Lower Limit Loss for Weakly-Supervised Temporal Action Localization

Cited by: 0
Authors
Mingwen Bi
Jiaqi Li
Xinliang Liu
Qingchuan Zhang
Zhenghong Yang
Affiliations
[1] China Agricultural University, College of Information and Electrical Engineering
[2] Ministry of Agriculture and Rural Affairs, Key Laboratory of Agricultural Informatization Standardization
[3] Beijing Technology and Business University, National Engineering Research Center for Agri
Source
Neural Processing Letters | 2023 / Vol. 55
Keywords
Weakly-supervised learning; Temporal action localization; Upper and lower limit loss; Action-aware network
DOI
Not available
Abstract
Weakly-supervised temporal action localization aims to detect the temporal boundaries of action instances in untrimmed videos using only video-level action labels. The main challenge is to accurately separate actions from the background in the absence of frame-level labels. Previous methods regard the action-related context in the background as the main factor limiting segmentation performance. Most of them use action labels as pseudo-labels for context and suppress context frames in class activation sequences with an attention mechanism. However, this approach only works for fixed shots or videos with a single theme. For videos with frequent scene switching and complicated themes, such as casual shots of unexpected events and secret shots, the strong randomness and weak continuity of the action invalidate that assumption. In addition, wrong pseudo-labels increase the weight of context frames, which degrades segmentation performance. To address these issues, we define a new video frame division standard (action instance, action-related context, no-action background) and propose an Action-aware Network with Upper and Lower Limit Loss (AUL-Net), which limits the activation of context to a reasonable range through a two-branch weight-sharing framework with a three-branch attention mechanism, so that the model has wider applicability while accurately suppressing context and background. We conducted extensive experiments on the self-built food safety video dataset FS-VA, and the results show that our method outperforms the state-of-the-art model.
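The abstract does not state the form of the upper and lower limit loss. As a purely illustrative sketch, and not the authors' formulation, one way to keep context activations within a range is a hinge-style penalty that is zero inside the interval and grows linearly outside it. The function name `range_limit_penalty`, the bounds, and the sample scores below are all hypothetical.

```python
from typing import Sequence


def range_limit_penalty(
    activations: Sequence[float], lower: float, upper: float
) -> float:
    """Mean hinge penalty for activations that fall outside [lower, upper].

    Hypothetical illustration only: the paper's actual loss is not
    specified in the abstract.
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if not activations:
        return 0.0
    total = 0.0
    for a in activations:
        # Penalize the distance below the lower bound or above the upper bound.
        total += max(0.0, lower - a) + max(0.0, a - upper)
    return total / len(activations)


# Assumed per-frame context activations and assumed bounds.
context_scores = [0.1, 0.4, 0.9]
print(range_limit_penalty(context_scores, lower=0.2, upper=0.6))
```

Under this assumed form, frames whose activation already lies inside the interval contribute nothing, so only out-of-range context frames are pushed back toward it.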
Pages: 4307 - 4324
Page count: 17
Related papers
50 in total
  • [1] Action-Aware Network with Upper and Lower Limit Loss for Weakly-Supervised Temporal Action Localization
    Bi, Mingwen
    Li, Jiaqi
    Liu, Xinliang
    Zhang, Qingchuan
    Yang, Zhenghong
    NEURAL PROCESSING LETTERS, 2023, 55 (04) : 4307 - 4324
  • [2] ASM-Loc: Action-aware Segment Modeling for Weakly-Supervised Temporal Action Localization
    He, Bo
    Yang, Xitong
    Kang, Le
    Cheng, Zhiyu
    Zhou, Xin
    Shrivastava, Abhinav
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 13915 - 13925
  • [3] Action Coherence Network for Weakly-Supervised Temporal Action Localization
    Zhai, Yuanhao
    Wang, Le
    Tang, Wei
    Zhang, Qilin
    Zheng, Nanning
    Hua, Gang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1857 - 1870
  • [4] A Novel Action Saliency and Context-Aware Network for Weakly-Supervised Temporal Action Localization
    Zhao, Yibo
    Zhang, Hua
    Gao, Zan
    Gao, Wenjie
    Wang, Meng
    Chen, Shengyong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8253 - 8266
  • [5] Background Suppression Network for Weakly-Supervised Temporal Action Localization
    Lee, Pilhyeon
    Uh, Youngjung
    Byun, Hyeran
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11320 - 11327
  • [6] Feature Matching Network for Weakly-Supervised Temporal Action Localization
    Dou, Peng
    Zhou, Wei
    Liao, Zhongke
    Hu, Haifeng
    PATTERN RECOGNITION AND COMPUTER VISION, PT IV, 2021, 13022 : 459 - 471
  • [7] Deep cascaded action attention network for weakly-supervised temporal action localization
    Hui-fen Xia
    Yong-zhao Zhan
    Multimedia Tools and Applications, 2023, 82 : 29769 - 29787
  • [8] Action Completeness Modeling with Background Aware Networks for Weakly-Supervised Temporal Action Localization
    Moniruzzaman, Md
    Yin, Zhaozheng
    He, Zhihai
    Qin, Ruwen
    Leu, Ming C.
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 2166 - 2174
  • [9] Deep cascaded action attention network for weakly-supervised temporal action localization
    Xia, Hui-fen
    Zhan, Yong-zhao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (19) : 29769 - 29787
  • [10] ACGNet: Action Complement Graph Network for Weakly-Supervised Temporal Action Localization
    Yang, Zichen
    Qin, Jie
    Huang, Di
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 3090 - 3098