Toward Video Anomaly Retrieval From Video Anomaly Detection: New Benchmarks and Model

Cited by: 9
Authors
Wu, Peng [1]
Liu, Jing [2]
He, Xiangteng [3]
Peng, Yuxin [3]
Wang, Peng [1]
Zhang, Yanning [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big D, Xian 710060, Peoples R China
[2] Xidian Univ, Guangzhou Inst Technol, Guangzhou 510555, Peoples R China
[3] Peking Univ, Wangxuan Inst Comp Technol, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Video anomaly retrieval; video anomaly detection; cross-modal retrieval;
DOI
10.1109/TIP.2024.3374070
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video anomaly detection (VAD) has received increasing attention due to its potential applications. Its currently dominant tasks focus on detecting anomalies online, which can be roughly interpreted as binary or multi-class event classification. However, such a setup, which maps complicated anomalous events to single labels, e.g., "vandalism", is superficial, since single labels are insufficient to characterize anomalous events. In practice, users tend to search for a specific video rather than a series of approximate videos. Therefore, retrieving anomalous events using detailed descriptions is practical and beneficial, but little research has focused on this. In this context, we propose a novel task called Video Anomaly Retrieval (VAR), which aims to pragmatically retrieve relevant anomalous videos using cross-modal queries, e.g., language descriptions and synchronous audio. Unlike current video retrieval, where videos are assumed to be temporally well-trimmed and short in duration, VAR is devised to retrieve long untrimmed videos that may be only partially relevant to the given query. To achieve this, we present two large-scale VAR benchmarks and design a model called Anomaly-Led Alignment Network (ALAN) for VAR. In ALAN, we propose an anomaly-led sampling strategy to focus on key segments in long untrimmed videos. Then, we introduce an efficient pretext task to enhance semantic associations between fine-grained video and text representations. In addition, we leverage two complementary alignments to further match cross-modal content. Experimental results on the two benchmarks reveal the challenges of the VAR task and also demonstrate the advantages of our tailored method. Captions are publicly released at https://github.com/Roc-Ng/VAR.
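To make the retrieval setting in the abstract concrete, the following is a minimal sketch of ranking videos against a text query by cosine similarity and scoring the result with Recall@K. It is a generic illustration of cross-modal retrieval, not the ALAN model: the function names (`cosine`, `rank_videos`, `recall_at_k`), the video ids, and the toy embedding values are all assumptions made for this example, and real embeddings would come from trained video and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_videos(query_vec, video_vecs):
    """Return video ids sorted from most to least similar to the query."""
    scores = {vid: cosine(query_vec, vec) for vid, vec in video_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def recall_at_k(ranked_ids, relevant_id, k):
    """1.0 if the relevant video is among the top-k results, else 0.0."""
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0


# Toy embeddings, invented for illustration (not taken from the paper).
videos = {
    "v_fight": [0.9, 0.1, 0.0],
    "v_fire": [0.1, 0.9, 0.2],
    "v_normal": [0.3, 0.3, 0.9],
}
query = [0.2, 0.8, 0.1]  # stands in for an encoded text description

ranking = rank_videos(query, videos)
print(ranking)
print(recall_at_k(ranking, "v_fire", k=1))
```

Averaging `recall_at_k` over all queries in a benchmark gives the usual Recall@K figure reported for retrieval tasks.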
Pages: 2213-2225
Number of pages: 13