Spatio-temporal graph-based self-labeling for video anomaly detection

Authors
Xing, Meng [1 ,2 ]
Feng, Zhiyong [3 ]
Su, Yong [4 ]
Zhang, Yiming [3 ]
Oh, Changjae [5 ]
Gribova, Valeriya [6 ]
Filaretov, Vladimir Fedorovich [6 ]
Huang, Deshuang [1 ,7 ]
Affiliations
[1] Ningbo Inst Digital Twin, Eastern Inst Technol, 568 Tongxin Rd,Zhuangshi St, Ningbo 315201, Zhejiang, Peoples R China
[2] Univ Sci & Technol China, Sch Informat Sci & Technol, 96 JinZhai Rd, Hefei 230026, Anhui, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, 135 Yaguan Rd,Haihe Educ Pk, Tianjin 300350, Peoples R China
[4] Tianjin Normal Univ, Tianjin Key Lab Wireless Mobile Commun & Power Tra, 393 Binshui West Rd, Tianjin 300387, Peoples R China
[5] Queen Mary Univ London, Ctr Intelligent Sensing, Mile End Rd, London E1 4NS, England
[6] Russian Acad Sci, Inst Automat & Control Proc, Far Eastern Branch, Radio St 5, Vladivostok 690041, Primorsky Krai, Russia
[7] Shanghai East Hosp, Inst Regenerat Med, 150 Jimo Rd, Shanghai 200120, Peoples R China
Funding
China Postdoctoral Science Foundation; National Science Foundation (USA);
Keywords
VAD; ST-graph; Self-labeling; Not-normal space; Object-level criterion; ABNORMAL EVENT DETECTION; CONVOLUTIONAL NETWORKS;
DOI
10.1016/j.neucom.2025.129576
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video anomaly detection (VAD) aims to identify abnormal events in a video sequence. Existing methods achieve VAD by learning the decision boundary between the normal space and the abnormal space pre-defined in the training data. However, these methods tend to neglect the distribution gap between the pre-defined abnormal space and the real one, which leads to overfitting on the normal space or bias toward the pre-defined abnormal space. In this paper, we propose a spatio-temporal graph-based self-labeling method that considers not only the pre-defined abnormal space but also the real abnormal space, enabling it to capture the decision boundary between the normal space and a complementary space, called the not-normal space. We first construct a spatio-temporal graph (ST-Graph) from the objects in the input video and use a spatio-temporal graph convolution network (ST-GCN) to model the interactions between objects. We then propose a self-labeling-based learning mechanism that encourages the proposed ST-GCN to record the normal events while abstaining from labeling the pseudo-abnormal events, thereby aggregating the pre-defined and real abnormal spaces into the not-normal space. To evaluate the model's performance on localizing anomalous objects and capturing interactions between objects, we further introduce an object-level criterion that bridges the frame-level and pixel-level criteria. Our method is validated on three datasets: it achieves state-of-the-art frame-level AUC on Avenue (92.5%) and outperforms existing ST-Graph-based methods on UCSD Ped2 (96.5%) and ShanghaiTech (76.8%).
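The abstract does not say how the ST-Graph edges are defined or how the ST-GCN is parameterized, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each node is one detected object in one frame, that objects in the same frame share a spatial edge, and that the same tracked object in adjacent frames shares a temporal edge. The layer is a generic symmetric-normalized graph convolution; the names build_st_adjacency and gcn_layer, the feature sizes, and the random weights are all hypothetical.

```python
import numpy as np


def build_st_adjacency(nodes):
    """Build a binary adjacency matrix for a toy spatio-temporal graph.

    nodes: list of (frame_index, track_id) pairs, one per detected object.
    Assumed edge rules (not taken from the paper):
      - spatial edge: two different objects in the same frame
      - temporal edge: the same track in two adjacent frames
    """
    n = len(nodes)
    adj = np.zeros((n, n), dtype=float)
    for i, (frame_i, track_i) in enumerate(nodes):
        for j, (frame_j, track_j) in enumerate(nodes):
            if i == j:
                continue
            same_frame = frame_i == frame_j
            same_track_next_frame = track_i == track_j and abs(frame_i - frame_j) == 1
            if same_frame or same_track_next_frame:
                adj[i, j] = 1.0
    return adj


def gcn_layer(features, adj, weight):
    """One generic graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # node degrees (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)


# Toy example: two tracked objects (tracks 0 and 1) over two frames.
nodes = [(0, 0), (0, 1), (1, 0), (1, 1)]
adj = build_st_adjacency(nodes)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))   # hypothetical 3-d object features
weight = rng.normal(size=(3, 2))     # hypothetical layer weights
out = gcn_layer(features, adj, weight)
```

In this toy graph, each node is linked to the other object in its frame and to its own track in the neighboring frame, so the output for a node mixes its own features with those of its spatial and temporal neighbors.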
Pages: 10