Scale-Aware Spatio-Temporal Relation Learning for Video Anomaly Detection

Cited by: 15

Authors:

Li, Guoqiu [1]

Cai, Guanxiong [2]

Zeng, Xingyu [2]

Zhao, Rui [2,3]

Affiliations:

[1] Tsinghua Univ, Beijing, Peoples R China

[2] SenseTime Res, Shanghai, Peoples R China

[3] Shanghai Jiao Tong Univ, Qing Yuan Res Inst, Shanghai, Peoples R China

Source:

COMPUTER VISION - ECCV 2022, PT IV | 2022 / Vol. 13664

Keywords:

Scale-aware; Weakly-supervised video anomaly detection; Spatio-temporal relation modeling;

DOI:

10.1007/978-3-031-19772-7_20

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent progress in video anomaly detection (VAD) has shown that feature discrimination is the key to effectively distinguishing anomalies from normal events. We observe that many anomalous events occur in limited local regions, and that severe background noise increases the difficulty of feature learning. In this paper, we propose a scale-aware weakly supervised learning approach to capture local and salient anomalous patterns from the background, using only coarse video-level labels as supervision. We achieve this by segmenting frames into non-overlapping patches and then capturing inconsistencies among different regions through our patch spatial relation (PSR) module, which consists of self-attention mechanisms and dilated convolutions. To address the scale variation of anomalies and enhance the robustness of our method, a multi-scale patch aggregation method is further introduced to enable local-to-global spatial perception by merging features of patches with different scales. Considering the importance of temporal cues, we extend the relation modeling from the spatial domain to the spatio-temporal domain with the help of the existing video temporal relation network to effectively encode the spatio-temporal dynamics in the video. Experimental results show that our proposed method achieves new state-of-the-art performance on the UCF-Crime and ShanghaiTech benchmarks. Code is available at https://github.com/nutuniv/SSRL.
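The patch-based steps named in the abstract (non-overlapping patch segmentation, relation modeling with self-attention, and handling several patch scales) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). The function names `split_into_patches` and `self_attention`, the mean-pooled patch descriptors, and the untrained identity-projection attention are all assumptions made purely for illustration.

```python
import numpy as np


def split_into_patches(frame, patch_h, patch_w):
    """Split an (H, W, C) frame into non-overlapping patches.

    Hypothetical helper: returns an array of shape
    (num_patches, patch_h, patch_w, C) in row-major patch order.
    """
    h, w, c = frame.shape
    if h % patch_h or w % patch_w:
        raise ValueError("frame size must be divisible by the patch size")
    rows, cols = h // patch_h, w // patch_w
    patches = frame.reshape(rows, patch_h, cols, patch_w, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # (rows, cols, ph, pw, C)
    return patches.reshape(rows * cols, patch_h, patch_w, c)


def self_attention(x):
    """Plain scaled dot-product self-attention over patch descriptors.

    Assumption: queries, keys and values are all `x` itself (no learned
    projections), which is a simplification of a trained attention layer.
    x has shape (num_patches, dim); the output has the same shape.
    """
    d = x.shape[1]
    scores = x @ x.T / np.sqrt(d)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.random((8, 8, 3))  # toy stand-in for a video frame
    for size in (2, 4):  # two illustrative patch scales
        patches = split_into_patches(frame, size, size)
        descriptors = patches.mean(axis=(1, 2))  # one C-dim vector per patch
        related = self_attention(descriptors)
        print(size, patches.shape, related.shape)
```

Each scale yields a different number of patches, so a real model would need an aggregation step to merge them; the abstract states that such a multi-scale patch aggregation exists but does not specify it, so it is left out here.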

Pages: 333-350

Number of pages: 18
