Toward Video Anomaly Retrieval From Video Anomaly Detection: New Benchmarks and Model

Cited by: 9
Authors
Wu, Peng [1]
Liu, Jing [2]
He, Xiangteng [3]
Peng, Yuxin [3]
Wang, Peng [1]
Zhang, Yanning [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big D, Xian 710060, Peoples R China
[2] Xidian Univ, Guangzhou Inst Technol, Guangzhou 510555, Peoples R China
[3] Peking Univ, Wangxuan Inst Comp Technol, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Video anomaly retrieval; video anomaly detection; cross-modal retrieval;
DOI
10.1109/TIP.2024.3374070
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video anomaly detection (VAD) has received increasing attention due to its potential applications. Its current dominant tasks focus on detecting anomalies online, which can be roughly interpreted as binary or multi-class event classification. However, a setup that ties complicated anomalous events to single labels, e.g., "vandalism", is superficial, since single labels are insufficient to characterize anomalous events. In reality, users tend to search for a specific video rather than a series of approximate videos. Therefore, retrieving anomalous events using detailed descriptions is practical and beneficial, but little research focuses on this. In this context, we propose a novel task called Video Anomaly Retrieval (VAR), which aims to pragmatically retrieve relevant anomalous videos across modalities, e.g., language descriptions and synchronous audio. Unlike current video retrieval, where videos are assumed to be temporally well-trimmed and short, VAR is devised to retrieve long untrimmed videos that may be only partially relevant to the given query. To achieve this, we present two large-scale VAR benchmarks and design a model called Anomaly-Led Alignment Network (ALAN) for VAR. In ALAN, we propose anomaly-led sampling to focus on key segments in long untrimmed videos. Then, we introduce an efficient pretext task to enhance semantic associations between fine-grained video-text representations. In addition, we leverage two complementary alignments to further match cross-modal content. Experimental results on the two benchmarks reveal the challenges of the VAR task and also demonstrate the advantages of our tailored method. Captions are publicly released at https://github.com/Roc-Ng/VAR.
Pages: 2213-2225
Number of pages: 13