Toward Video Anomaly Retrieval From Video Anomaly Detection: New Benchmarks and Model

Cited by: 9
Authors
Wu, Peng [1]
Liu, Jing [2]
He, Xiangteng [3]
Peng, Yuxin [3]
Wang, Peng [1]
Zhang, Yanning [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big D, Xian 710060, Peoples R China
[2] Xidian Univ, Guangzhou Inst Technol, Guangzhou 510555, Peoples R China
[3] Peking Univ, Wangxuan Inst Comp Technol, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Video anomaly retrieval; video anomaly detection; cross-modal retrieval
DOI
10.1109/TIP.2024.3374070
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video anomaly detection (VAD) has received increasing attention due to its potential applications. Its currently dominant tasks focus on detecting anomalies online, which can be roughly interpreted as binary or multi-class event classification. However, such a setup, which builds relationships between complicated anomalous events and single labels, e.g., "vandalism", is superficial, since single labels are insufficient to characterize anomalous events. In reality, users tend to search for a specific video rather than a series of approximate videos. Therefore, retrieving anomalous events using detailed descriptions is practical and beneficial, but little research focuses on this. In this context, we propose a novel task called Video Anomaly Retrieval (VAR), which aims to pragmatically retrieve relevant anomalous videos across modalities, e.g., via language descriptions and synchronous audio. Unlike current video retrieval, where videos are assumed to be temporally well-trimmed and short, VAR is devised to retrieve long untrimmed videos that may be only partially relevant to the given query. To achieve this, we present two large-scale VAR benchmarks and design a model called Anomaly-Led Alignment Network (ALAN) for VAR. In ALAN, we propose an anomaly-led sampling to focus on key segments in long untrimmed videos. Then, we introduce an efficient pretext task to enhance semantic associations between fine-grained video-text representations. In addition, we leverage two complementary alignments to further match cross-modal content. Experimental results on the two benchmarks reveal the challenges of the VAR task and also demonstrate the advantages of our tailored method. Captions are publicly released at https://github.com/Roc-Ng/VAR.
Pages: 2213-2225
Page count: 13
Related papers
50 in total
  • [41] DISCRIMINATIVE CLIP MINING FOR VIDEO ANOMALY DETECTION
    Sun, Li
    Chen, Yanjun
    Luo, Wu
    Wu, Haiyan
    Zhang, Chongyang
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 2121 - 2125
  • [42] Background separation network for video anomaly detection
    Ye, Qing
    Song, Zihan
    Zhao, Yuqi
    Zhang, Yongmei
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2024, 46 (03) : 6535 - 6551
  • [43] Towards Open Set Video Anomaly Detection
    Zhu, Yuansheng
    Bao, Wentao
    Yu, Qi
    COMPUTER VISION, ECCV 2022, PT XXXIV, 2022, 13694 : 395 - 412
  • [44] Video Anomaly Detection by Estimating Likelihood of Representations
    Ouyang, Yuqi
    Sanchez, Victor
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 8984 - 8991
  • [45] Otoscopy Video Screening with Deep Anomaly Detection
    Wang, Weiyao
    Tamhane, Aniruddha
    Rzasa, John R.
    Clark, James H.
    Canares, Therese L.
    Unberath, Mathias
    MEDICAL IMAGING 2021: COMPUTER-AIDED DIAGNOSIS, 2021, 11597
  • [47] Video anomaly detection based on scene classification
    Li, Hongjun
    Shen, Xulin
    Sun, Xiaohu
    Wang, Yunlong
    Li, Chaobo
    Chen, Junjie
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (29) : 45345 - 45365
  • [48] Deep Video Anomaly Detection: Opportunities and Challenges
    Ren, Jing
    Xia, Feng
    Liu, Yemeng
    Lee, Ivan
    21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS ICDMW 2021, 2021, : 959 - 966
  • [49] An informative dual ForkNet for video anomaly detection
    Li, Hongjun
    Wang, Yunlong
    Wang, Yating
    Chen, Junjie
    NEURAL NETWORKS, 2024, 179
  • [50] CamNuvem: A Robbery Dataset for Video Anomaly Detection
    de Paula, Davi D.
    Salvadeo, Denis H. P.
    de Araujo, Darlan M. N.
    SENSORS, 2022, 22 (24)