Weakly Supervised Video Anomaly Detection via Transformer-Enabled Temporal Relation Learning

Cited by: 22
Authors
Zhang, Dasheng [1]
Huang, Chao [2]
Liu, Chengliang [2]
Xu, Yong [2, 3]
Affiliations
[1] Chongqing Univ, Sch Artificial Intelligence, Chongqing 401135, Peoples R China
[2] Harbin Inst Technol, Shenzhen Key Lab Visual Object Detect & Recognit, Shenzhen 518055, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
Feature extraction; Transformers; Task analysis; Anomaly detection; Training; Surveillance; Training data; Deep learning; video anomaly detection; vision transformer; weakly-supervised learning;
DOI
10.1109/LSP.2022.3175092
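Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification code
0808; 0809
Abstract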
Weakly supervised video anomaly detection is a challenging problem due to the lack of frame-level labels in training videos. Most previous works tackle this task with the multiple instance learning paradigm, which divides a video into multiple snippets and trains a snippet classifier to distinguish anomalous snippets from normal ones using only video-level supervision. Although existing approaches have made remarkable progress, they are still limited by insufficient feature representations. In this paper, we propose a novel weakly supervised temporal relation learning framework for anomaly detection, which efficiently explores the temporal relations between snippets and enhances the discriminative power of the features using only videos with video-level labels. To this end, we design a transformer-enabled feature encoder that converts the input task-agnostic features into discriminative task-specific features by mining the semantic correlation and positional relation between video snippets. As a result, our model can detect anomalies in the current video snippet more accurately based on the learned discriminative features. Experimental results indicate that the proposed method is superior to existing state-of-the-art approaches, which demonstrates the effectiveness of our model.
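The abstract names two ingredients without giving details: relating snippets to one another with a transformer-style encoder, and training from video-level labels in the multiple instance learning (MIL) setting. The sketch below illustrates both ideas in plain Python and is not the authors' implementation. The function names, the identity query/key/value projections, the max-pooled video score, and the hinge ranking loss with `margin=1.0` are all assumptions chosen for illustration; the record does not state which loss or architecture the paper actually uses.

```python
import math


def self_attention(snippets):
    """Single-head scaled dot-product self-attention over snippet features.

    Assumption: identity query/key/value projections and no positional
    encoding, to keep the sketch short. A real encoder would learn these.
    """
    d = len(snippets[0])
    refined = []
    for q in snippets:
        # Similarity of this snippet to every snippet in the same video.
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in snippets]
        peak = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(x - peak) for x in logits]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Each refined feature is a weighted average of all snippet features.
        refined.append(
            [sum(w * v[j] for w, v in zip(weights, snippets)) for j in range(d)]
        )
    return refined


def video_score(snippet_scores):
    """MIL assumption: a video is as anomalous as its most anomalous snippet."""
    return max(snippet_scores)


def mil_ranking_loss(abnormal_scores, normal_scores, margin=1.0):
    """Hinge loss pushing an abnormal video's top score above a normal video's."""
    gap = video_score(abnormal_scores) - video_score(normal_scores)
    return max(0.0, margin - gap)


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three hypothetical snippets
    print(self_attention(features))
    print(mil_ranking_loss([0.1, 0.9, 0.2], [0.1, 0.3]))
```

Only video-level labels enter the loss: the snippet scores are compared through their per-video maximum, so no frame-level annotation is needed, which is the weak supervision the abstract refers to.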
Pages: 1197-1201
Page count: 5
Related papers
50 in total
  • [21] Sequential attention mechanism for weakly supervised video anomaly detection
    Ullah, Waseem
    Ullah, Fath U. Min
    Khan, Zulfiqar Ahmad
    Baik, Sung Wook
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 230
  • [22] Event-driven weakly supervised video anomaly detection
    Sun, Shengyang
    Gong, Xiaojin
    IMAGE AND VISION COMPUTING, 2024, 149
  • [23] Weakly supervised video anomaly detection based on hyperbolic space
    Qi, Meilin
    Wu, Yuanyuan
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [24] A Convolutional Autoencoder Approach for Weakly Supervised Anomaly Video Detection
    Phan Nguyen Duc Hieu
    Phan Duy Hung
    COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2023, 2023, 14162 : 138 - 150
  • [25] BatchNorm-Based Weakly Supervised Video Anomaly Detection
    Zhou, Yixuan
    Qu, Yi
    Xu, Xing
    Shen, Fumin
    Song, Jingkuan
    Shen, Heng Tao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (12) : 13642 - 13654
  • [26] Relabeling Abnormal Videos via Intra-Video Label Propagation for Weakly Supervised Video Anomaly Detection
    Thou, Wenhao
    Li, Yingxuan
    Zhao, Jiancheng
    Zhao, Chunhui
    2024 14TH ASIAN CONTROL CONFERENCE, ASCC 2024, 2024 : 1200 - 1205
  • [27] Learning Prompt-Enhanced Context Features for Weakly-Supervised Video Anomaly Detection
    Pu, Yujiang
    Wu, Xiaoyu
    Yang, Lulu
    Wang, Shengjin
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 4923 - 4936
  • [28] A Self-Paced Multiple Instance Learning Framework for Weakly Supervised Video Anomaly Detection
    He, Ping
    Li, Huibin
    Han, Miaolin
    APPLIED SCIENCES-BASEL, 2025, 15 (03):
  • [29] M2VAD: Multiview multimodality transformer-based weakly supervised video anomaly detection
    Paulraj, Shalmiya
    Vairavasundaram, Subramaniyaswamy
    IMAGE AND VISION COMPUTING, 2024, 149
  • [30] Weakly Supervised Temporal Action Detection With Temporal Dependency Learning
    Li, Bairong
    Liu, Ruixin
    Chen, Tianquan
    Zhu, Yuesheng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (07) : 4473 - 4485