Progressive prediction: Video anomaly detection via multi-grained prediction

Cited by: 1
Authors
Zeng, Xianlin [1]
Jiang, Yalong [2]
Wang, Yufeng [2]
Fu, Qiang [2]
Ding, Wenrui [2]
Affiliations
[1] Beihang Univ, Sch Elect & Informat Engn, Beijing, Peoples R China
[2] Beihang Univ, Unmanned Syst Res Inst, Beijing 100191, Peoples R China
Funding
Beijing Natural Science Foundation
Keywords
computer vision; unsupervised learning; video signal processing; video surveillance; IDENTIFICATION; ALGAE; NETWORKS
DOI
10.1049/ipr2.13117
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Video Anomaly Detection (VAD) has been an active research field for several decades. However, most existing approaches extract only a single type of feature from videos and define a single paradigm to indicate the extent of abnormality. To address this, a coarse-to-fine three-level prediction is built by integrating different levels of spatio-temporal representations, which better highlights the difference between normal and abnormal behaviors. First, an object-level trajectory prediction is proposed to model historical human positions using a graph transformer network. Subsequently, skeleton-level prediction is achieved by incorporating the positional information from the trajectory prediction. More importantly, a skeleton-guided pixel-level region prediction is performed based on the predicted skeleton. A novel Skeleton Conditioned Generative Adversarial Network (SCGAN) is designed to explore the correlation between skeleton-level and pixel-level motion prediction. Thanks to SCGAN, the prediction of human regions draws on both coarse-grained and fine-grained motion features. This three-level prediction, namely Progressive Prediction Video Anomaly Detection (P3VAD), enlarges the prediction error on irregular motion patterns. In addition, a pixel-level analysis method is proposed to achieve Background-bias Elimination (BE) and denoise the predicted region. This is the first effort to progressively combine three-level predictions, from coarse to fine-grained, for VAD. Extensive experiments on four publicly available large-scale benchmark datasets (ShanghaiTech, CUHK Avenue, IITB-Corridor, and ADOC) validate the effectiveness of P3VAD under both micro-AUC and macro-AUC metrics.
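The abstract reports results under micro-AUC and macro-AUC without defining them. The sketch below illustrates the convention commonly used in the VAD literature: micro-AUC pools the frames of all test videos before computing a single AUC, while macro-AUC computes one AUC per video and averages them. It assumes the paper follows this convention. The function names, the toy scores, and the choice to skip videos that contain only one class are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical container: video name -> (frame labels, frame anomaly scores).
# Label 1 marks an abnormal frame, label 0 a normal frame.
Videos = Dict[str, Tuple[List[int], List[float]]]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random abnormal frame outscores a
    random normal frame, with ties counted as one half.
    Quadratic in the number of frames, which is fine for a small sketch."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both normal and abnormal frames")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def micro_auc(videos: Videos) -> float:
    """Pool the frames of every video, then compute one AUC."""
    all_labels: List[int] = []
    all_scores: List[float] = []
    for labels, scores in videos.values():
        all_labels.extend(labels)
        all_scores.extend(scores)
    return roc_auc(all_labels, all_scores)


def macro_auc(videos: Videos) -> float:
    """Average the per-video AUCs.
    Assumption: videos with only one class are skipped."""
    per_video = [
        roc_auc(labels, scores)
        for labels, scores in videos.values()
        if 0 < sum(labels) < len(labels)
    ]
    if not per_video:
        raise ValueError("no video contains both classes")
    return sum(per_video) / len(per_video)


# Toy data: each video is ranked perfectly on its own, but the two videos
# have different score ranges, so the pooled ranking is imperfect.
videos: Videos = {
    "video_a": ([0, 0, 1, 1], [0.5, 0.6, 0.8, 0.9]),
    "video_b": ([0, 0, 1, 1], [0.1, 0.15, 0.3, 0.4]),
}

print("micro-AUC:", micro_auc(videos))  # 0.75
print("macro-AUC:", macro_auc(videos))  # 1.0
```

In the toy data, macro-AUC is 1.0 because each video separates its own normal and abnormal frames, whereas micro-AUC drops to 0.75 because the abnormal frames of `video_b` score below the normal frames of `video_a`. This is why reporting both metrics gives a fuller picture than either alone.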
Pages: 2568-2583
Number of pages: 16